gpjt's submissions | Hacker News

1.		Flax debugging: making a hash of things (gilesthomas.com)
		2 points by gpjt 2 days ago \| past \| discuss
2.		10Gb/s Ethernet: switching to a Broadcom SFP+ module (gilesthomas.com)
		189 points by gpjt 2 days ago \| past \| 165 comments
3.		Jax: Commitment Issues (gilesthomas.com)
		4 points by gpjt 3 days ago \| past \| discuss
4.		Jax Back Ends and Devices (gilesthomas.com)
		2 points by gpjt 13 days ago \| past \| discuss
5.		Using Safetensors with Flax (gilesthomas.com)
		2 points by gpjt 14 days ago \| past
6.		First Looking into Jax (gilesthomas.com)
		3 points by gpjt 19 days ago \| past
7.		10Gb/s Ethernet: using mini-heatsinks with a 10GBASE-T SFP+ module (gilesthomas.com)
		3 points by gpjt 31 days ago \| past
8.		10Gb/s Ethernet: what I did to get it working in my home (gilesthomas.com)
		232 points by gpjt 50 days ago \| past \| 177 comments
9.		10Gb Ethernet: what I had to (re)learn (gilesthomas.com)
		1 point by gpjt 51 days ago \| past \| 1 comment
10.		LLM from scratch, part 33 – what I learned from the appendices (gilesthomas.com)
		5 points by gpjt 57 days ago \| past
11.		LLM from scratch (32l) – Interventions: updated instruction fine-tuning results (gilesthomas.com)
		1 point by gpjt 59 days ago \| past
12.		How an LLM becomes more coherent as we train it (gilesthomas.com)
		3 points by gpjt 62 days ago \| past
13.		LLM from scratch, part 32k – Interventions: gradient accumulation (gilesthomas.com)
		2 points by gpjt 64 days ago \| past
14.		Provision: LLM-powered server setup from Markdown (provision.sh)
		2 points by gpjt 69 days ago \| past
15.		LLM from scratch, part 32j – trying to train a better model in the cloud (gilesthomas.com)
		2 points by gpjt 70 days ago \| past
16.		Writing an LLM from scratch, part 32i – Interventions: what is in the noise? (gilesthomas.com)
		1 point by gpjt 72 days ago \| past
17.		Writing an LLM from scratch, part 32h – Interventions: full fat float32 (gilesthomas.com)
		7 points by gpjt 76 days ago \| past
18.		Writing an LLM from scratch, part 32g – Interventions: weight tying (gilesthomas.com)
		2 points by gpjt 86 days ago \| past
19.		Writing an LLM from scratch, part 32f – Interventions: weight decay (gilesthomas.com)
		6 points by gpjt 87 days ago \| past
20.		Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com)
		3 points by gpjt 3 months ago \| past
21.		Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com)
		6 points by gpjt 4 months ago \| past
22.		Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)
		1 point by gpjt 4 months ago \| past
23.		Writing an LLM from scratch, part 32B – Interventions: gradient clipping (gilesthomas.com)
		2 points by gpjt 4 months ago \| past
24.		Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com)
		1 point by gpjt 4 months ago \| past
25.		Getting a Custom PyTorch LLM onto the Hugging Face Hub (gilesthomas.com)
		1 point by gpjt 4 months ago \| past
26.		Writing an LLM from scratch, part 31 – the models are now on Hugging Face (gilesthomas.com)
		2 points by gpjt 5 months ago \| past
27.		Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results (gilesthomas.com)
		1 point by gpjt 5 months ago \| past
28.		LLM from scratch, part 29 – using DDP to train a base model in the cloud (gilesthomas.com)
		2 points by gpjt 5 months ago \| past
29.		LLM from scratch, part 28 – training a base model from scratch on an RTX 3090 (gilesthomas.com)
		540 points by gpjt 6 months ago \| past \| 121 comments
30.		Writing an LLM from scratch, part 27 – what's left, and what's next? (gilesthomas.com)
		1 point by gpjt 7 months ago \| past