educative.io

Can Cassandra or Bigtable have a variable (huge) number of columns?

Hi

This is in reference to the Instagram material and its suggestion of storing the photos posted by a user in a Cassandra table (see below). The Twitter material made a similar choice.

My question is: a single user may have a huge number of Tweets or pictures (hundreds or thousands), so is it still practical to store all of their pictures or Tweets in Cassandra?

From the course:
We need to store relationships between users and photos, to know who owns which photo. We also need to store the list of people a user follows. For both of these tables, we can use a wide-column datastore like Cassandra. For the ‘UserPhoto’ table, the ‘key’ would be ‘UserID’ and the ‘value’ would be the list of ‘PhotoIDs’ the user owns, stored in different columns. We will have a similar scheme for the ‘UserFollow’ table.
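To make the quoted layout concrete, here is a minimal in-memory sketch of the 'UserPhoto' idea: one row per UserID, one column per PhotoID. It is only a toy model in plain Python, not Cassandra's API; the names `user_photo`, `add_photo`, and `photos_of`, and the sample IDs and paths, are made up for illustration.

```python
# Hypothetical in-memory stand-in for a wide-column table (not a Cassandra API).
# Row key: UserID. Column name: PhotoID. Column value: a small metadata dict.
user_photo = {}

def add_photo(user_id, photo_id, metadata):
    """Add one column (keyed by PhotoID) to the user's row."""
    user_photo.setdefault(user_id, {})[photo_id] = metadata

def photos_of(user_id):
    """Return the user's PhotoIDs in sorted column order."""
    return sorted(user_photo.get(user_id, {}))

# Rows can have different numbers of columns; each row only stores what it has.
add_photo("user42", "photo_0002", {"path": "bucket/photo_0002.jpg"})
add_photo("user42", "photo_0001", {"path": "bucket/photo_0001.jpg"})
add_photo("user7", "photo_0003", {"path": "bucket/photo_0003.jpg"})

print(photos_of("user42"))  # ['photo_0001', 'photo_0002']
print(photos_of("user7"))   # ['photo_0003']
```

The point of the sketch is that the row for "user42" and the row for "user7" do not share a fixed schema: each one grows by a column whenever that user posts a photo.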

Thanks

I believe we are only storing the metadata of images in these tables; the images themselves are stored on S3.

You can opt for any other database; it's just that this one gives you useful features, such as undeleting as mentioned, that other datastores don't.

Great, thanks. It does look like Bigtable allows a very high number of columns per column family, so we can store all the metadata or even the Tweet text. I was thinking people may tweet 10 times per day, and over 10 years that becomes ~36,500 tweets (10 × 365 × 10); Bigtable should be able to store that many columns.
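The back-of-envelope estimate above can be worked through explicitly. This is just arithmetic; the 300 bytes per tweet (text plus a little metadata) is my own assumption, not a figure from the course.

```python
TWEETS_PER_DAY = 10
DAYS_PER_YEAR = 365
YEARS = 10
BYTES_PER_TWEET = 300  # assumed average size of one column value

# Number of columns in one user's row after ten years.
total_tweets = TWEETS_PER_DAY * DAYS_PER_YEAR * YEARS

# Rough size of that row if every tweet is stored inline.
approx_row_bytes = total_tweets * BYTES_PER_TWEET

print(total_tweets)                              # 36500
print(f"{approx_row_bytes / 1_000_000:.2f} MB")  # 10.95 MB
```

So even a very active user ends up with tens of thousands of columns and a row on the order of ten megabytes under these assumptions.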

I understand that pictures and other multimedia are stored in HDFS, S3, etc.

Moreover, if I follow the Bigtable architecture correctly, each cell can be fetched on its own by providing the row key and the column key.
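As a rough illustration of that addressing model, here is a toy sparse table in plain Python where a single cell is looked up by row key plus column key. It is not the Bigtable client API: `TinyWideTable`, `put`, `get_cell`, and `get_columns` are hypothetical names, and the sketch leaves out the timestamp that real Bigtable cells also carry.

```python
class TinyWideTable:
    """Toy model of a sparse map: (row key, column key) -> value."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, column_key, value):
        # A row only stores the columns that were actually written.
        self._rows.setdefault(row_key, {})[column_key] = value

    def get_cell(self, row_key, column_key):
        """Fetch one cell; returns None if the row or column is absent."""
        return self._rows.get(row_key, {}).get(column_key)

    def get_columns(self, row_key, prefix):
        """Return the row's columns whose key starts with prefix, in sorted order."""
        row = self._rows.get(row_key, {})
        return {key: row[key] for key in sorted(row) if key.startswith(prefix)}


table = TinyWideTable()
# Column keys use a "family:qualifier" shape, e.g. family "tweets".
table.put("user42", "tweets:000002", "second tweet")
table.put("user42", "tweets:000001", "first tweet")
table.put("user42", "profile:name", "Alice")

print(table.get_cell("user42", "tweets:000002"))     # second tweet
print(list(table.get_columns("user42", "tweets:")))  # ['tweets:000001', 'tweets:000002']
```

Fetching one tweet does not require reading the rest of the row, and fetching a whole family is a scan over a sorted range of column keys.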