Comment on How does this pic show that Elon Musk doesnt know SQL?
nednobbins@lemm.ee 6 days ago
It’s so basic that documentation is completely unnecessary.
“De-duping” could mean multiple things, depending on what you mean by “duplicate”.
It could mean that the entire row of some table is the same. But that has nothing to do with the kind of fraud he’s talking about. Two people with the same SSN but different names wouldn’t be duplicates by that definition, so “de-duping” wouldn’t remove them.
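To make that concrete, here is a minimal sketch using Python's built-in `sqlite3`. The `people` table, its columns, and the values are all made up for illustration; nothing here reflects how any real system is laid out.

```python
import sqlite3

# Hypothetical table and values, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ssn TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("123-45-6789", "Alice Smith"),
        ("123-45-6789", "Alice Smith"),  # exact copy of the row above
        ("123-45-6789", "Bob Jones"),    # same SSN, different name
    ],
)

# Whole-row de-duplication: only the exact copy collapses.
distinct_rows = conn.execute(
    "SELECT DISTINCT ssn, name FROM people ORDER BY name"
).fetchall()

# A different question entirely: which SSNs appear under more than one name?
shared_ssns = conn.execute(
    "SELECT ssn, COUNT(DISTINCT name) FROM people "
    "GROUP BY ssn HAVING COUNT(DISTINCT name) > 1"
).fetchall()

print(distinct_rows)
print(shared_ssns)
```

The `DISTINCT` query still returns both names for the one SSN. Finding the shared SSN takes a separate `GROUP BY` query, and that query only reports it; deciding what it means is not a database operation.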
It can also mean that a certain value shows up more than once (e.g. just the SSN). But that’s something you often want in database systems. A transaction log of SSN contributions would likely have the same SSN repeated hundreds of times. That has nothing to do with fraud; it’s just how you record that the same account has multiple contributions.
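A rough sketch of that second case, again with `sqlite3`. The `contributions` table, its columns, and the amounts are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical log table: one row per contribution, not one row per person.
conn.execute(
    "CREATE TABLE contributions (ssn TEXT, period TEXT, amount_cents INTEGER)"
)
rows = [("123-45-6789", f"2024-{month:02d}", 31000) for month in range(1, 13)]
conn.executemany("INSERT INTO contributions VALUES (?, ?, ?)", rows)

# The same SSN legitimately appears once per monthly contribution.
row_count, total_cents = conn.execute(
    "SELECT COUNT(*), SUM(amount_cents) FROM contributions WHERE ssn = ?",
    ("123-45-6789",),
).fetchone()

# "De-duping" on the SSN column would collapse the whole history to one row.
rows_after_dedupe = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT ssn FROM contributions)"
).fetchone()[0]

print(row_count, total_cents, rows_after_dedupe)
```

Twelve rows with one SSN is the correct state of this table. Removing the "duplicates" would throw away eleven months of records.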
A database system as large as the SSA’s needs to deal with all kinds of variations in data (misspellings, abbreviations, moves, siblings, common names, etc). Something as simplistic as “no dupes anywhere” would break immediately.
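As a small illustration of why exact matching falls over, here is a sketch using `difflib` from the Python standard library. The names are invented, and plain string similarity is just a stand-in here, not a claim about how record matching is actually done anywhere.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Probably the same person, with a misspelling: an exact comparison misses it.
exact_match = "Jonathan Smith" == "Jonathon Smith"
typo_score = similarity("Jonathan Smith", "Jonathon Smith")

# Possibly two different people (say, siblings): the score is also high.
sibling_score = similarity("Jon Smith", "Jan Smith")

print(exact_match, round(typo_score, 2), round(sibling_score, 2))
```

Exact equality treats the misspelling as two people, while a fuzzy score rates the two possibly different people as nearly identical too. Neither rule can be applied blindly.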
MathiasTCK@lemmy.world 6 days ago
SSN is also not a valid unique key, there have been situations with multiple people issued the same SSN:
en.wikipedia.org/wiki/Social_Security_number
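A hedged sketch of what that means for schema design, using `sqlite3`. Both tables and the `person_id` surrogate key are hypothetical; the point is only that a key the real world doesn't guarantee to be unique can't safely be declared unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Variant A (hypothetical): SSN declared as the primary key.
conn.execute("CREATE TABLE by_ssn (ssn TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO by_ssn VALUES ('123-45-6789', 'Alice Smith')")
try:
    # A second real person who was issued the same number cannot be stored.
    conn.execute("INSERT INTO by_ssn VALUES ('123-45-6789', 'Bob Jones')")
    second_insert_ok = True
except sqlite3.IntegrityError:
    second_insert_ok = False

# Variant B (hypothetical): surrogate key, SSN is an indexed attribute.
conn.execute(
    "CREATE TABLE persons (person_id INTEGER PRIMARY KEY, ssn TEXT, name TEXT)"
)
conn.execute("CREATE INDEX persons_ssn ON persons (ssn)")
conn.executemany(
    "INSERT INTO persons (ssn, name) VALUES (?, ?)",
    [("123-45-6789", "Alice Smith"), ("123-45-6789", "Bob Jones")],
)
holders = conn.execute(
    "SELECT person_id, name FROM persons WHERE ssn = ? ORDER BY person_id",
    ("123-45-6789",),
).fetchall()

print(second_insert_ok, holders)
```

With the surrogate key, both holders of the number exist as separate people and the shared SSN is just something you can query for.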
DacoTaco@lemmy.world 6 days ago
Just read the format of the US SSN in that Wikipedia article. That wasn’t a smart format to use lol. It only supports 99*9999 (roughly 1 million) people per area number. No wonder numbers are reused.
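A quick sanity check on that capacity, assuming the AAA-GG-SSSS layout (3-digit area number, 2-digit group, 4-digit serial) and assuming all-zero groups and serials are never issued. With a four-digit serial, the per-area figure lands just under a million:

```python
# Layout assumed: AAA-GG-SSSS (area number, group number, serial number).
groups_per_area = 99       # 01..99, group 00 assumed to be unassigned
serials_per_group = 9999   # 0001..9999, serial 0000 assumed to be unassigned

per_area = groups_per_area * serials_per_group
print(per_area)
```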
In some countries it’s birthday + sequence number (which also encodes gender) + checksum, and that has been working since the ’80s.
Before that we had a different number, but like the US SSN it wasn’t future-proof, so we migrated away from it in the ’80s :')
Wispy2891@lemmy.world 6 days ago
In my country, the only way two people get the same number is if they were born on the same day (±1 century) in the same city and have the same given name and family name. It is extremely difficult to get duplicates that way (exception: immigrants, because the “city code” is the same for a whole foreign country, so it’s not impossible that there are two Ananya Guptas born on the same day in all of India).
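A rough sketch of how such a derived number collides. The field layout, the two-digit year, and the place codes (`CITY-A`, `CITY-B`, `FOREIGN-IN`) are all assumptions made up for the example, not the real encoding of any country's number.

```python
from collections import Counter
from datetime import date
from typing import NamedTuple


class Person(NamedTuple):
    given: str
    family: str
    born: date
    place_code: str  # hypothetical: one code per domestic city, one per foreign country


def id_key(p: Person) -> tuple:
    """Derive the identifying tuple; the two-digit year is an assumption."""
    yymmdd = f"{p.born.year % 100:02d}{p.born.month:02d}{p.born.day:02d}"
    return (p.family.upper(), p.given.upper(), yymmdd, p.place_code)


people = [
    Person("Ananya", "Gupta", date(1990, 5, 17), "FOREIGN-IN"),  # born in one Indian city
    Person("Ananya", "Gupta", date(1990, 5, 17), "FOREIGN-IN"),  # born in another
    Person("Sam", "Sample", date(1990, 5, 17), "CITY-A"),
    Person("Sam", "Sample", date(1990, 5, 17), "CITY-B"),        # different city: no clash
    Person("Sam", "Sample", date(1890, 5, 17), "CITY-A"),        # a century apart: clash
]

counts = Counter(id_key(p) for p in people)
collisions = {key: n for key, n in counts.items() if n > 1}
print(collisions)
```

Only two keys collide: the shared foreign-country code, and the same name, day, and city exactly one century apart.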
DacoTaco@lemmy.world 6 days ago
Oh yeah, our system wouldn’t fit India, as it’s limited to 500 births a day (the sequence is 3 digits, and whether it’s even or odd indicates your gender). Your system seems fine and beats the US system hands down haha
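For the curious, a minimal sketch of a birthday + sequence + checksum number along those lines. The 11-digit layout, the mod-97 check digits, and the function names are assumptions picked for illustration; the real scheme may differ in every detail.

```python
from datetime import date


def make_id(born: date, sequence: int) -> str:
    """Build YYMMDD + 3-digit sequence + 2 check digits (hypothetical layout)."""
    if not 0 <= sequence <= 999:
        raise ValueError("sequence must fit in 3 digits")
    body = f"{born.year % 100:02d}{born.month:02d}{born.day:02d}{sequence:03d}"
    check = 97 - int(body) % 97  # assumed checksum rule, always in 1..97
    return f"{body}{check:02d}"


def is_valid(id_number: str) -> bool:
    """Recompute the assumed check digits and compare."""
    if len(id_number) != 11 or not id_number.isdigit():
        return False
    body, check = id_number[:9], id_number[9:]
    return 97 - int(body) % 97 == int(check)


def sequence_parity(id_number: str) -> int:
    """0 for an even sequence, 1 for odd; which gender maps to which is not assumed."""
    return int(id_number[6:9]) % 2


# Three digits split by parity leave 500 sequence numbers per parity per day.
slots_per_parity = len(range(0, 1000, 2))

sample = make_id(date(1985, 3, 14), 123)
print(sample, is_valid(sample), sequence_parity(sample), slots_per_parity)
```

The check digits catch any single mistyped digit, which is something the plain US format has no way to do.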
nednobbins@lemm.ee 5 days ago
Yeah. And the fix for that has nothing to do with “de-duping” as a database operation either.
The main components would probably be:
There’s a lot of complexity in each of those steps, but none of them are particularly dependent on “de-duped” databases.