6 Comments
Nick Santos

Hi Christopher, I really enjoyed reading this and was nodding along until the end, when you addressed dataset availability. I think you're pointing out that Earth Engine gives users the ability to skip the initial ETL and get straight to analysis and prototyping of workflows - something that's still time-consuming on most other platforms. But your conclusion goes straight from acknowledging that capability to "Earth Engine is too costly".

If that's Google's secret sauce, so to speak, what value could we reasonably assign it? For the large majority of customers who I'd bet (citation needed) aren't running a global scale model, so aren't factoring compute of that size, I wonder if the ability to jump straight to piloting and prototyping an analysis with specific datasets has huge value. Then, they can decide if they want to run it in Earth Engine itself, or continue building an ingestion pipeline somewhere else.

For that kind of use case, I'd guess it's *not* too expensive, but your article proves it's missing a customer there, who *does* want to use Earth Engine for the full compute, but can more economically perform the work elsewhere.

Anyway, I'd be curious for a bit more of your thoughts on how the dataset availability you touch on at the end could or should factor into the equation here.

Christopher Ren

Hi Nick, first of all thank you so much for commenting, I think it's really valuable to have these discussions out in the open.

"For the large majority of customers who I'd bet (citation needed) aren't running a global scale model, so aren't factoring compute of that size, I wonder if the ability to jump straight to piloting and prototyping an analysis with specific datasets has huge value."

This is a very interesting point! Perhaps it is not overpriced for those customers. However, if you measure customers in terms of dollars spent rather than headcount, then losing out on the big scaling customers could be a big loss.

I guess I wasn't clear in the post: I think GEE is too costly **along certain dimensions** particularly when it comes to scaling analyses. The reason I think this is an issue is because ultimately GEE charges a premium on top of compute + storage: in exchange you skip the initial ETL as you mentioned. IMO, if you are in the business of charging premiums for compute and storage, then you are in the business of incentivizing your customers to use your platform as much as possible, not use it for exploration and then transition to something else when it finally comes time to actually spend money.

Here is a link to a comment someone left on LinkedIn discussing this as a GEE customer: https://www.linkedin.com/feed/update/urn:li:ugcPost:7346417267691180033?commentUrn=urn%3Ali%3Acomment%3A%28ugcPost%3A7346417267691180033%2C7346418372982497283%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287346418372982497283%2Curn%3Ali%3AugcPost%3A7346417267691180033%29

I wrote this article because I felt that the current GEE pricing structure did not encourage customers to stay on the platform once it comes time to scale, but it's also just my opinion and I don't have visibility into all the internal pressures the GEE team might face when deciding on pricing!

With respect to the dataset availability I touched on: I think it can be a double-edged sword. It is obviously invaluable to be able to overlay/intersect data with other geospatial data. However, you are also reliant on the GEE ingestion team, and may not be able to access new data as quickly as you need it. I think we both agree that GEE is currently undefeated when it comes to exploratory/prototyping workflows, so perhaps a sensible change in pricing structure would be to reduce the EECU-s cost in the higher usage tiers to entice customers to run their workflows at scale on the platform, while keeping the lower tiers similar?
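To make the tiering idea concrete, here is a minimal sketch of a marginal-rate tier schedule of the kind being suggested. The function and every threshold and rate are invented for illustration; none of this reflects GEE's actual pricing.

```python
def eecu_seconds_cost(usage, tiers):
    """Price `usage` EECU-seconds against marginal (cumulative_threshold, rate) tiers."""
    cost, prev = 0.0, 0
    for threshold, rate in tiers:
        band = min(usage, threshold) - prev  # usage falling inside this tier
        if band <= 0:
            break
        cost += band * rate
        prev = threshold
    return cost

# Keep the lower tiers similar, discount the top tier to reward running at scale.
# Thresholds in EECU-seconds, rates in $ per EECU-second -- all invented numbers.
TIERS = [(10_000, 0.40), (100_000, 0.40), (float("inf"), 0.10)]

# 10k*0.40 + 90k*0.40 + 100k*0.10 = 50,000 -- versus 80,000 at a flat 0.40 rate
big_workload_cost = eecu_seconds_cost(200_000, TIERS)
```

Under this kind of schedule, prototyping workloads in the low tiers pay essentially the same as today, while the discount only kicks in for the large-scale customers the comment is concerned about losing.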

Adithya Kothandhapani

Hi Chris, awesome and timely post! Are the annual cloud costs baked into the 0.2 FTE to maintain the custom pipeline?

Christopher Ren

Thanks Adithya, glad to hear it's somewhat useful although please do double check the numbers if you're making business decisions based on this!

The 0.2 FTE is engineer time only; it does not take into account cloud costs for compute, data storage, transfer, etc.

Adithya Kothandhapani

I was actually comparing against an estimate we did last year at SkyServe, where we estimated the running costs of a custom pipeline with AWS on the back end. Adjusting for the case you presented, we got $24.5k/year paid to Amazon. That increases the slope of the custom-pipeline plot in the comparison then, right? Especially as you remove the subscription component from the GEE case.

Here’s the doc: https://cdn.prod.website-files.com/660e7952369feece9a6c0e45/66cc7b8cf3049d0c922d444a_Cost%20Estimation%20Case%20Study.pdf

Christopher Ren

Ah, I think if your costs are $24.5k for global inference, you land squarely in the $15k–$35k range we use as our estimate for the custom pipeline! The slope is probably close to the mean case.
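The back-of-envelope arithmetic behind this exchange can be sketched as follows. Only the 0.2 FTE maintenance figure, the $24.5k/year AWS estimate, and the $15k–$35k custom-pipeline range come from the discussion above; the fully loaded engineer cost is an assumed figure.

```python
FTE_ANNUAL_COST = 150_000    # assumed fully loaded engineer cost, USD/yr (not from the thread)
MAINTENANCE_FTE = 0.2        # engineer time only, per the article
AWS_RUNNING_COST = 24_500    # SkyServe's adjusted AWS running-cost estimate, USD/yr

# Custom-pipeline total = people cost + cloud cost
engineer_cost = MAINTENANCE_FTE * FTE_ANNUAL_COST
total_custom_pipeline = engineer_cost + AWS_RUNNING_COST

# Does the AWS estimate land inside the article's $15k-$35k range?
low, high = 15_000, 35_000
within_range = low <= AWS_RUNNING_COST <= high

print(f"engineer time:        ${engineer_cost:,.0f}/yr")
print(f"total custom pipeline: ${total_custom_pipeline:,.0f}/yr")
print(f"AWS estimate within article range: {within_range}")
```

The point Adithya raises is visible here: adding a cloud-cost term on top of engineer time shifts the custom-pipeline line upward, even though the AWS figure alone sits near the middle of the article's estimated range.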
