That’s all of CS and IT.
Comment on What is OOP, really? Why so many different definitions?
halcyon@slrpnk.net 11 months ago
Once upon a time, “big data” meant datasets large enough that it was impractical to store or work with them in traditional relational database software. Which is where distributed storage structures came into play, with the ability to spread both storage and computation across clusters of machines, using solutions like Hadoop and MongoDB. That seemed to be the direction things were heading 10 to 15 years ago.
However, with the automated scaling built into modern cloud databases, the line has gotten a bit blurry: Snowflake, Redshift, and BigQuery all handle many billions of rows just fine. I probably wouldn’t use the term “big data” in a professional context these days, but there is a table size after which I write code a bit more carefully.
I suppose my point is that the term once meant something, but marketing stole it because it sounds cool. I worked in a tech shop in the late aughts where the sales team insisted on calling every rack-mounted server a “blade server”, regardless of whether it had modular swappable boards. Because it sounded cool.
slacktoid@lemmy.ml 11 months ago
maynarkh@feddit.nl 11 months ago
How I remember it, it’s not even the whole dataset that is too large, but the individual records. Hadoop, for example, is not doing anything magic; it’s just a software package to extend MySQL so it can efficiently hold pictures as records (Facebook’s original use case, though of course it evolved).
I guess big data is whatever you need it to be to justify what you want to justify. In the case of one of my gigs, it was public funding for a project.