BitNet LLM - Instructions
Getting Started
BitNet is a CPU-based large language model inference engine. This package provides SSH access so you can interact with the model directly.
Requirements
- CPU with AVX2 support (Intel Haswell or AMD Excavator or newer)
- Check support: SSH into your StartOS server and run
grep -o 'avx2' /proc/cpuinfo
If the CPU supports AVX2, this prints avx2 (once per occurrence in the flags lines); no output means the CPU lacks AVX2.
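A minimal one-liner that turns the same check into a yes/no verdict (a sketch assuming a standard Linux /proc filesystem):
# grep -q's exit status tells us whether the avx2 flag is present
if grep -q avx2 /proc/cpuinfo; then echo "AVX2 supported"; else echo "AVX2 NOT supported"; fi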
Configuration
- After installation, go to the Config tab
- Paste your SSH public key into the "SSH Authorized Keys" field
- Save the configuration
- Restart the service
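If you don't have a key pair yet, one common way to generate one on your client machine looks like this (the filename and comment are arbitrary choices, not requirements of this package):
# Generate an ed25519 key pair; the contents of the .pub file go into "SSH Authorized Keys"
ssh-keygen -t ed25519 -f ~/.ssh/bitnet -C "bitnet-access"
cat ~/.ssh/bitnet.pub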
Connecting via SSH
Get your Tor address from the Interfaces tab, then:
ssh -o ProxyCommand="nc -x localhost:9050 %h %p" root@your-bitnet-address.onion
Or via LAN (if configured):
ssh root@your-server-ip -p <bitnet-ssh-port>
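To avoid retyping the ProxyCommand, you can add a host alias to ~/.ssh/config. This sketch assumes Tor's SOCKS proxy listens on localhost:9050 and reuses the key file from the step above:
# ~/.ssh/config
Host bitnet
    HostName your-bitnet-address.onion
    User root
    IdentityFile ~/.ssh/bitnet
    ProxyCommand nc -x localhost:9050 %h %p
After that, connecting is just: ssh bitnet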
Using BitNet
Once connected via SSH:
# Run inference with default model
python3 /BitNet/run_inference.py -m ggml-model-i2_s.gguf -p "Your prompt here"
# See all options
python3 /BitNet/run_inference.py --help
# Use custom model (mount in /root/models)
python3 /BitNet/run_inference.py -m /root/models/your-model.gguf -p "Your prompt"
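As an illustration only: upstream's run_inference.py forwards llama.cpp-style options, so flags such as -n (tokens to generate) and -t (threads) are commonly available, but confirm against --help on your installed version:
# Assumed flags (-n, -t); verify with run_inference.py --help
python3 /BitNet/run_inference.py -m ggml-model-i2_s.gguf -p "Your prompt here" -n 128 -t 4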
Custom Models
- Download GGUF models from Hugging Face
- Upload them via File Browser or SCP to /root/models/
- Reference them in your inference commands
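For example, uploading a model over Tor with SCP could look like this (it reuses the nc ProxyCommand from the SSH step; the model filename is a placeholder):
# Copy a GGUF model into /root/models/ over Tor
scp -o ProxyCommand="nc -x localhost:9050 %h %p" your-model.gguf root@your-bitnet-address.onion:/root/models/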
Performance
This runs entirely on CPU using AVX2 instructions. Performance depends on your server's CPU speed and core count. The default 2B parameter model is lightweight and should run reasonably well on modern hardware.
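Once connected, a quick way to size the thread count is to match it to the available cores (the -t flag is the same assumption noted above):
# Run with one thread per available core
python3 /BitNet/run_inference.py -m ggml-model-i2_s.gguf -p "Hello" -t "$(nproc)"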
Troubleshooting
- Can't connect via SSH: Verify your public key is correctly configured
- "Illegal instruction" errors: Your CPU doesn't support AVX2
- Slow inference: Normal for CPU-based inference; consider a smaller model or faster CPU
Support
- Upstream: https://github.com/microsoft/BitNet
- Container: https://github.com/kth8/bitnet