Is “model” not defined as architecture+weights? Those models certainly don’t share the same architecture. I might just be confused about your point, though.
communist@lemmy.frozeninferno.xyz 2 weeks ago
It is, but this did not prove that no architecture can reason, nor did it prove that no set of weights can reason.
Essentially, they did not prove the issue is fundamental. And those models do have pretty similar architectures: they’re all transformers trained in a similar way.
0ops@lemm.ee 2 weeks ago
Ah, gotcha