The best way to estimate deep storage capacity requirements is to index a representative sample of data and observe the size of the resulting compressed segments in deep storage.
Once you know the ratio of compressed segment size to raw data size for the sample, scale it by your expected data volume and the number of retention days. This ratio varies significantly with the characteristics of the raw data.
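
The scaling step can be sketched as a small calculation. This is only an illustration: the function name, parameter names, and sample figures are hypothetical, and it assumes the sample's size ratio holds for the full data set and that daily volume is roughly constant.

```python
def estimate_deep_storage_bytes(sample_raw_bytes, sample_segment_bytes,
                                daily_raw_bytes, retention_days):
    """Rough deep storage estimate extrapolated from a test ingestion.

    Assumes (hypothetically) that the sample is representative, so the
    segment-to-raw size ratio carries over to production data.
    """
    if sample_raw_bytes <= 0:
        raise ValueError("sample_raw_bytes must be positive")
    # Ratio of compressed segment size to raw input size, measured on the sample.
    ratio = sample_segment_bytes / sample_raw_bytes
    # Scale by the expected daily volume and the retention period.
    return daily_raw_bytes * ratio * retention_days


# Hypothetical figures: a 10 GB sample produced 2 GB of segments,
# and production ingests about 500 GB of raw data per day, kept for 30 days.
GB = 10**9
estimate = estimate_deep_storage_bytes(10 * GB, 2 * GB, 500 * GB, 30)
print(f"Estimated deep storage: {estimate / GB:.0f} GB")
```

Treat the result as a starting point and add headroom, since the ratio can shift as the shape of the incoming data changes.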
A similar estimation method can be applied to the segment cache on local storage, except that segments in the cache are uncompressed. Take the unreplicated size reported by the Coordinator (in the datasource view) and multiply it by your replication factor (default 2).
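
As a minimal sketch of the segment cache arithmetic, assuming you have already read the unreplicated size from the Coordinator by hand (the function name and the figure below are hypothetical):

```python
def estimate_segment_cache_bytes(unreplicated_bytes, replication_factor=2):
    """Total local segment cache needed across the cluster.

    unreplicated_bytes: uncompressed size of one copy of the data,
    as shown in the Coordinator's datasource view.
    replication_factor: number of copies kept; 2 is the default
    mentioned in the text above.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return unreplicated_bytes * replication_factor


# Hypothetical figure: 1.5 TB unreplicated, default replication.
TB = 10**12
total = estimate_segment_cache_bytes(1_500_000_000_000)
print(f"Estimated segment cache: {total / TB:.1f} TB")
```

The result is the cluster-wide total, so it still has to be spread across the nodes that hold the segment cache.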