
The Dell G5 SE 5505 laptop has an AMD Ryzen 5 4600H CPU with 6 cores, 8 GB of RAM, and an AMD RX 5600M GPU.

  1. Install Ubuntu 25.04 Server (https://releases.ubuntu.com/25.04/ubuntu-25.04-live-server-amd64.iso). This is pretty straightforward: download the ISO, write it to a USB stick with Rufus, then boot from that USB stick and run the installer. One tip: the installation sometimes fails with errors when it reaches the storage step. The solution is to wipe the start of the disk first.
    Press Ctrl + Alt + F2 to get a shell, then clear the storage with: sudo dd if=/dev/zero of=/dev/nvme0n1 bs=1M count=10 (this overwrites the first 10 MiB of the drive, including its partition table, so make sure nothing on it is needed). Press Ctrl + Alt + F1 to return to the installation UI.
  2. After installing Ubuntu, install llama.cpp. Because the AMD RX 5600M is an old GPU, I chose Vulkan over ROCm. Install the Vulkan drivers and the Vulkan SDK, then download llama.cpp and build it with the Vulkan backend enabled. Run: cmake -B build -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release, then: cmake --build build --config Release
  3. Download Qwen3-4B-Instruct-2507-Q4_K_M.gguf, then start the llama.cpp server: llama-server -m ~/models/Qwen3-4B-Instruct-2507-Q4_K_M.gguf -ngl 33 --host 0.0.0.0 --batch-size 2048 --ctx-size 8192 --flash-attn --no-webui --temp 0.3 --top-k 33
  4. Results: prefill (prompt processing) runs at 140 tokens per second, and inference (token generation) at 20 tokens per second.
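
To get a feel for what those two rates mean for a single request, here is a minimal shell sketch. It is only an illustration, not part of the original setup: the function names estimate_seconds and ask are made up for this example, the estimate ignores model load time and network overhead, and ask assumes the server from step 3 is listening on llama-server's default port 8080 (no --port was given) with its OpenAI-compatible chat endpoint.

```shell
#!/usr/bin/env bash
# Rates reported above for this laptop, in tokens per second.
PREFILL_TPS=140
GENERATE_TPS=20

# estimate_seconds PROMPT_TOKENS OUTPUT_TOKENS
# Prints a rough wall time in seconds: prompt/prefill rate + output/generation rate.
estimate_seconds() {
  LC_ALL=C awk -v p="$1" -v o="$2" -v pr="$PREFILL_TPS" -v gr="$GENERATE_TPS" \
    'BEGIN { printf "%.1f\n", p / pr + o / gr }'
}

# ask PROMPT [HOST]
# Hypothetical helper: sends one chat request to the running llama-server.
# Assumes port 8080 and a prompt without quotes or backslashes,
# because no JSON escaping is done here.
ask() {
  curl -s "http://${2:-localhost}:8080/v1/chat/completions" \
    -H 'Content-Type: application/json' \
    -d "{\"messages\":[{\"role\":\"user\",\"content\":\"$1\"}],\"max_tokens\":200}"
}

# A 1000-token prompt with a 200-token reply: 1000/140 + 200/20 seconds.
estimate_seconds 1000 200   # prints 17.1
```

With the server running, something like ask "Say hello in one sentence" should return a JSON response. Since the web UI is disabled with --no-webui, the HTTP API is the way to talk to the model.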

Categories: Blog

Chris


Just me, need more info?
