educative.io

How is the number of unique strings calculated with base64 encoding?

I am going through this lesson, and it contains the following statement:

With 1M pastes every day we will have 3.6 billion Pastes in 10 years. We need to generate and store keys to uniquely identify these pastes. If we use base64 encoding ([A-Z, a-z, 0-9, ., -]) we would need six letters strings:
64^6 ~= 68.7 billion unique strings

My question is: how do we arrive at the 64^6 calculation? I don't understand the math behind it, and I had the same doubt in the URL service lesson. Can anyone explain it, please?

Thanks in advance.

Hi @Karthik_Vg,
The math is simple: we just need to make sure that the number of unique keys is large enough to give every paste its own key.

We use base64 encoding to generate the keys, so the question is how long each key must be to give at least 3.6 billion unique strings. Working backwards:

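key length = log_64 (3.6 billion) ~= 5.29

We round this up to 6, the next whole number, and that is the key length to use: 5 characters give only 64^5 ~= 1.07 billion strings, which is less than 3.6 billion, while 6 characters give 64^6 ~= 68.7 billion. The base of the logarithm is 64 because each character in the key (not each byte) can take one of 64 different values, so a key of length n has 64^n possible values.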
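If it helps to see the arithmetic run, here is a small Python sketch of the same calculation. The function name `min_key_length` and the constants are my own, chosen for illustration and not taken from the lesson; it assumes 365 days per year and a 64-character alphabet.

```python
import math

ALPHABET_SIZE = 64          # base64: each key character has 64 possible values
PASTES_PER_DAY = 1_000_000  # figure quoted from the lesson
YEARS = 10

# Total pastes to identify, assuming 365 days per year (about 3.65 billion).
total_pastes = PASTES_PER_DAY * 365 * YEARS


def min_key_length(n_items: int, alphabet_size: int) -> int:
    """Smallest length L such that alphabet_size ** L >= n_items.

    Uses integer arithmetic to avoid floating-point rounding issues.
    """
    length = 1
    while alphabet_size ** length < n_items:
        length += 1
    return length


# The logarithm gives the fractional length, a bit over 5.
print(math.log(total_pastes, ALPHABET_SIZE))

# Rounding up to the next whole number gives the key length.
print(min_key_length(total_pastes, ALPHABET_SIZE))

# Capacity with 5 versus 6 characters.
print(ALPHABET_SIZE ** 5)  # 1073741824  (too few)
print(ALPHABET_SIZE ** 6)  # 68719476736 (about 68.7 billion)
```

The same function works for the URL service lesson: pass the expected number of URLs and the alphabet size, and it returns the shortest key length that covers them.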

I hope this clears up your confusion. If you have any further questions, do not hesitate to ask.