# BitNet LLM - Instructions

## Getting Started

BitNet is a CPU-based large language model inference engine. This package provides SSH access so you can interact with the model directly.

### Requirements

- **CPU with AVX2 support** (Intel Haswell, AMD Excavator, or newer)
- Check support: SSH into your StartOS server and run `grep -o 'avx2' /proc/cpuinfo` (or use the quick check below)
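If you'd rather get a plain yes/no verdict than raw `grep` output, a one-liner like the following works. This is just a convenience sketch around the same `/proc/cpuinfo` check; nothing in it is specific to BitNet:

```bash
# Report AVX2 support in plain terms (same /proc/cpuinfo source as above)
if grep -qm1 avx2 /proc/cpuinfo; then
  echo "AVX2 supported: BitNet should run"
else
  echo "No AVX2: expect 'Illegal instruction' errors"
fi
```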

### Configuration

1. After installation, go to the **Config** tab
2. Paste your SSH public key into the "SSH Authorized Keys" field (if you don't have a key pair yet, see the example after this list)
3. Save the configuration
4. Restart the service
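If you don't already have an SSH key pair on your client machine, the standard OpenSSH tooling generates one. The comment string below is arbitrary, and any modern key type works:

```bash
# Generate an ed25519 key pair (press Enter to accept the default path)
ssh-keygen -t ed25519 -C "bitnet-access"

# Print the public half; paste this single line into "SSH Authorized Keys"
cat ~/.ssh/id_ed25519.pub
```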

### Connecting via SSH

Get your Tor address from the **Interfaces** tab, then:

```bash
ssh -o ProxyCommand="nc -x localhost:9050 %h %p" root@your-bitnet-address.onion
```

Or via LAN (if configured), substituting the SSH port the service exposes:

```bash
ssh root@your-server-ip -p <port>
```
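To avoid retyping the Tor proxy options, you can add a host alias to `~/.ssh/config` on your client. The alias name `bitnet` is arbitrary, and the `.onion` address is the same placeholder as above:

```
Host bitnet
    HostName your-bitnet-address.onion
    User root
    ProxyCommand nc -x localhost:9050 %h %p
```

After that, `ssh bitnet` connects through Tor without extra flags.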

### Using BitNet

Once connected via SSH:

```bash
# Run inference with default model
python3 /BitNet/run_inference.py -m ggml-model-i2_s.gguf -p "Your prompt here"

# See all options
python3 /BitNet/run_inference.py --help

# Use custom model (mount in /root/models)
python3 /BitNet/run_inference.py -m /root/models/your-model.gguf -p "Your prompt"
```
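For repeated prompts, a small wrapper script saves retyping the model path. This is a sketch, not part of the package: the script name and the `MODEL` default are assumptions, and `-m`/`-p` are the flags shown above (check `--help` for the rest):

```bash
#!/bin/sh
# /root/ask.sh - tiny convenience wrapper around the bundled inference script.
# Usage: ./ask.sh "Your prompt here"
# MODEL defaults to the bundled model; override it to use a custom one.
MODEL="${MODEL:-ggml-model-i2_s.gguf}"
exec python3 /BitNet/run_inference.py -m "$MODEL" -p "$1"
```

Make it executable with `chmod +x /root/ask.sh`, then run something like `MODEL=/root/models/your-model.gguf ./ask.sh "Hello"`.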

### Custom Models

1. Download GGUF models from Hugging Face
2. Upload them via File Browser or SCP to `/root/models/` (an SCP example follows this list)
3. Reference them in your inference commands
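Step 2 over Tor can reuse the same proxy trick as the SSH command above; the filename and `.onion` address are placeholders:

```bash
# Copy a downloaded GGUF model into /root/models/ over Tor
scp -o ProxyCommand="nc -x localhost:9050 %h %p" \
    your-model.gguf root@your-bitnet-address.onion:/root/models/
```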

### Performance

This runs entirely on CPU using AVX2 instructions. Performance depends on your server's CPU speed and core count. The default 2B-parameter model is lightweight and should run reasonably well on modern hardware.
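To get a rough feel for your hardware before trying longer prompts, you can time a short run with the shell's `time` builtin; treat the result as a ballpark, since output speed varies widely with core count:

```bash
# Rough single-run benchmark; wall-clock time dominates on CPU
time python3 /BitNet/run_inference.py -m ggml-model-i2_s.gguf -p "Hello"
```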

## Troubleshooting

- **Can't connect via SSH**: Verify your public key is correctly configured (verbose client output helps; see below)
- **"Illegal instruction" errors**: Your CPU doesn't support AVX2
- **Slow inference**: Normal for CPU-based inference; consider a smaller model or faster CPU
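For the first item, running the SSH client in verbose mode shows whether your key is being offered and why the server rejects it:

```bash
# -v prints the key exchange and authentication attempts step by step
ssh -v -o ProxyCommand="nc -x localhost:9050 %h %p" root@your-bitnet-address.onion
```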

## Support

- Upstream: https://github.com/microsoft/BitNet
- Container: https://github.com/kth8/bitnet