Test result

by krustik - opened about 16 hours ago

The GGUF model at the best quality, Q8, uses 539 GB of RAM, which is unusually high compared to its storage size (GGUFs usually need about the file size plus 10%).
It was a mistake to base this product on Meta's Llama 3 405B, which is a really bad model in my tests. It is lobotomized by censorship to the point of uselessness (by that I mean that when asked to do reasoning and the like, the model returns a standard refusal, and the same on many similar topics), and it was incredibly slow. For a long time we had nothing to compare its speed against, but after the large Deepseek 671B, which is fast even on CPU-only setups, we can see how badly Meta's Llama 3 405B was made.
Because it is built on Llama 405B, this Tulu 3 model is incredibly slow and sluggish: 10x slower than Deepseek V3 or R1 671B-Q6 on the same hardware (0.70 tokens/sec for Deepseek R1-Q6, 0.07 tokens/sec for tulu3/llama-Q8).
As for quality, unfortunately, even at Q8 it failed my tests of repairing ChucK code or creating a Mozart piece and produced broken code. Deepseek V2.5-Q8, which is just a 236B model, was able to repair the code. It would have been better to use it as the basis for Tulu, or Deepseek V3/R1, which at Q6 uses 567 GB of RAM.

Testing was done on a 20-core Xeon, CPU-only setup, on an enterprise-level Gigabyte motherboard with 12 RAM slots, in the oobabooga (text-generation-webui 2.4) LLM launcher, with the GGUF Q8 model by Bartowski.
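
As a rough check on the RAM figure, here is a back-of-the-envelope estimate of the weight size alone. It assumes every tensor is stored as Q8_0 (34 bytes per block of 32 weights, i.e. 8.5 bits per weight) and a round 405 billion parameters; both are simplifying assumptions, and the KV cache and compute buffers are not counted.

```latex
\[
405 \times 10^{9}\ \text{weights} \times \frac{34\ \text{bytes}}{32\ \text{weights}} \approx 430\ \text{GB},
\qquad
\frac{539\ \text{GB}}{430\ \text{GB}} \approx 1.25
\]
```

Under those assumptions the observed 539 GB is roughly 25% above the weights, compared with the 10% or so mentioned above.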

amanrangapur

Ai2 org about 16 hours ago

edited about 16 hours ago

Hey @krustik, appreciate the detailed feedback. The high RAM usage on GGUF Q8 is definitely something worth looking into. Regarding speed and reasoning capabilities, performance can vary a lot based on architecture, quantization, and runtime optimizations, especially on CPU-only setups. Would love to see some example prompts where it struggled; that’d help us understand what’s going on.
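
To illustrate the architecture point with rough numbers: Deepseek V3/R1 is a mixture-of-experts model that activates about 37B of its 671B parameters per token, while the 405B base model is dense. The comparison below is only a sketch. It assumes CPU decoding speed scales with the number of parameters read per token and ignores the Q6 vs Q8 difference and all runtime details.

```latex
\[
\frac{0.70\ \text{tokens/s}}{0.07\ \text{tokens/s}} = 10,
\qquad
\frac{405\ \text{B dense parameters}}{37\ \text{B active parameters}} \approx 10.9
\]
```

Under that assumption, the reported 10x gap is about what the difference in active parameters alone would suggest.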

krustik

about 16 hours ago

My test is simple: any Mozart piece in the ChucK language. So far, Deepseek R1 has managed to create code without errors on the first attempt, but only when I used its own reasoning from a failed attempt as the new prompt.
Repairing logical problems in a simple ASMR ChucK piece is still hard for most models. With this Tulu3 I waited ~2 hours for the task to finish; with Deepseek R1 the same task takes maybe 30 minutes.
The broken code is:

i have this problem:
<compiled.code>:2:21: syntax error
[2] int NUM_OSCILLATORS = 8;

in this code below. Please repair or improve it and write full repaired code.
The code:

// Define some constants for the number of oscillators and effects
int NUM_OSCILLATORS = 8;
int NUM_EFFECTS = 3;

// Array to hold oscillators
SinOsc s[NUM_OSCILLATORS] @=> dac;
PulseOsc p[NUM_OSCILLATORS] @=> dac;
SquareOsc q[NUM_OSCILLATORS] @=> dac;

// Array to hold effects
Reverb r[NUM_EFFECTS];
Delay d[NUM_EFFECTS];

// Initialize oscillators with random frequencies and amplitudes
for (int i = 0; i < NUM_OSCILLATORS; i++) {
Math.random2f(150.0, 300.0) => s[i].freq; // Whispers range
Math.random2f(0.05, 0.1) => s[i].gain;

Math.random2f(100.0, 300.0) => p[i].freq; // Finger snaps range
Math.random2f(0.5, 0.8) => p[i].gain;

Math.random2f(50.0, 150.0) => q[i].freq; // Tapping/scratching range
Math.random2f(0.05, 0.1) => q[i].gain;
}

// Initialize effects with random parameters
for (int i = 0; i < NUM_EFFECTS; i++) {
Reverb @=> r[i];
Delay @=> d[i];

Math.random2f(0.01, 0.1) => r[i].mix;
Math.random2f(0.5, 1.0) => r[i].gain;

Math.random2f(0.05, 0.1) => d[i].delay;
Math.random2f(0.3, 0.6) => d[i].feedback;
}

// Connect oscillators to effects
for (int i = 0; i < NUM_OSCILLATORS; i++) {
int effectIndex = Math.randomInt(NUM_EFFECTS);
s[i] => r[effectIndex];
p[i] => d[effectIndex];
q[i] => r[(effectIndex + 1) % NUM_EFFECTS];
}

// Function to modulate oscillator parameters over time
fun void modulateOscillators() {
while (true) {
for (int i = 0; i < NUM_OSCILLATORS; i++) {
Math.random2f(150.0, 300.0) => s[i].freq;
Math.random2f(0.05, 0.1) => s[i].gain;

Math.random2f(100.0, 300.0) => p[i].freq;
Math.random2f(0.5, 0.8) => p[i].gain;

Math.random2f(50.0, 150.0) => q[i].freq;
Math.random2f(0.05, 0.1) => q[i].gain;
}

// Randomly change connections between oscillators and effects
for (int i = 0; i < NUM_OSCILLATORS; i++) {
int effectIndex = Math.randomInt(NUM_EFFECTS);
s[i] => r[effectIndex];
p[i] => d[effectIndex];
q[i] => r[(effectIndex + 1) % NUM_EFFECTS];
}

// Sleep for a random duration
Math.random2f(0.5, 2.0)::second => now;
}
}

// Start modulating oscillators in a separate shred
modulateOscillators() @=> SPorkID modulator;

// Function to add soft whispers
fun void addWhispers() {
while (true) {
SinOsc whisper => dac;
Math.random2f(150.0, 300.0) => whisper.freq; // Whispers range
Math.random2f(0.05, 0.1) => whisper.gain;

// Whisper for a short duration
Math.random2f(0.5, 1.0)::second => now;

// Turn off the whisper
0 => whisper.gain;
}
}

// Start adding whispers in a separate shred
addWhispers() @=> SPorkID whisperer;

// Function to add finger snapping
fun void addFingerSnaps() {
while (true) {
PulseOsc snap => dac;
Math.random2f(100.0, 300.0) => snap.freq; // Finger snaps range
Math.random2f(0.5, 0.8) => snap.gain;

// Snap for a short duration
0.1::second => now;

// Turn off the snap
0 => snap.gain;
}
}

// Start adding finger snaps in a separate shred
addFingerSnaps() @=> SPorkID snapper;

// Function to add subtle tapping or scratching noises
fun void addTappingScratching() {
while (true) {
SquareOsc tap => dac;
Math.random2f(50.0, 150.0) => tap.freq; // Tapping/scratching range
Math.random2f(0.05, 0.1) => tap.gain;

// Tap for a short duration
Math.random2f(0.3, 0.6)::second => now;

// Turn off the tap
0 => tap.gain;
}
}

// Start adding tapping or scratching noises in a separate shred
addTappingScratching() @=> SPorkID tapper;

// Function to add background music (soothing tones)
fun void addBackgroundMusic() {
SinOsc bgMusic => dac;
Math.random2f(80.0, 120.0) => bgMusic.freq; // Soothing range
Math.random2f(0.5, 1.0) => bgMusic.gain;

while (true) {
// Change frequency and gain occasionally
Math.random2f(80.0, 120.0) => bgMusic.freq;
Math.random2f(0.5, 1.0) => bgMusic.gain;

// Sleep for a longer duration
5::second => now;
}
}

// Start adding background music in a separate shred
addBackgroundMusic() @=> SPorkID bgMusicer;
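
For reference, the syntax error reported at line 2 comes from the C-style initializer: ChucK assigns left to right with the `=>` operator, and functions are started as concurrent shreds with `spork ~`. The fragment below is only a minimal sketch of those three constructs (declaration, loop header, sporking), reusing the names from the listing above. It is not the full repair and not any model's output, and the fixed 1-second step is an illustrative assumption.

```chuck
// ChucK assigns left to right: value => variable
8 => int NUM_OSCILLATORS;

// Declare the oscillator array, then patch each element to dac
SinOsc s[NUM_OSCILLATORS];
for (0 => int i; i < NUM_OSCILLATORS; i++) {
    s[i] => dac;
    Math.random2f(150.0, 300.0) => s[i].freq; // Whispers range
    Math.random2f(0.05, 0.1) => s[i].gain;
}

// Re-randomize the frequencies, then advance time
fun void modulateOscillators() {
    while (true) {
        for (0 => int i; i < NUM_OSCILLATORS; i++) {
            Math.random2f(150.0, 300.0) => s[i].freq;
        }
        // Illustrative fixed step (assumption); the listing uses a random duration
        1::second => now;
    }
}

// Start the function in a separate shred
spork ~ modulateOscillators();

// Keep the parent shred alive so the child shred keeps running
while (true) {
    1::second => now;
}
```

The same `0 => int i` and `spork ~` patterns would apply to the other loops and functions in the listing; the remaining problems there are left as they are, since that is what the test asks the model to repair.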
