Hacker News

Off-topic, but our freemium website is under attack by headless browsers.

The freemium service provides access to compute-heavy machine learning models running on GPUs.

Hackers blast 50-100 requests in the same second, which clog the servers and block legitimate users.

We reported IPs to AWS and use Cloudflare "Super Bot Fight Mode" to thwart attacks, but the hackers still break through.

We don't require accounts, but could impose account requirements if this helps.

Any suggestions?



Browser automation works by dispatching events into the DOM or by calling properties of the page/window. It's all JavaScript designed for user interaction, executed by a bot instead.

The one event that cannot be automated is cursor movement/position. Put a check into your event handlers that verifies the cursor is actually over the event target.


You are right: every testing solution out there pushes UIEvents to the page rather than clicking with an actual mouse. That's why Puppeteer, Selenium, etc. are scraping tools, not testing tools.


That sounds like an accessibility problem.


Use an alternate control for keyboard navigation that is visually hidden and is accessed only by tab focus.


This is interesting. Thanks for sharing.

Are you saying block form submission unless the cursor is over the event target?

If so:

* How to handle legitimate requests from mobile users?

* How to handle form submissions with the "return" key?


Mobile users will use touch events instead of click events; your interface will likely be different, and the screen width will be different. Check for these things, along with keywords from the user-agent string, to distinguish mobile users from other users.

The Return key on a form control will fire a submit event. Check cursor position in your submit handler.


Just block the AWS ASN on CF, it's not worth fighting.


+1 and GCP, and many other hosting ASNs


Tell freemium users what the acceptable rate of requests per second is. Publish the allowable rate on the website. Ban freemium user IPs that exceed it. This can be done with a proxy.
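A minimal sketch of that per-IP limiting, independent of any particular proxy (the class and method names are illustrative, not from any specific library):

```python
import time
from collections import defaultdict, deque
from typing import Optional

class RateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window` seconds per IP."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent allowed requests

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        # Evict timestamps that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # caller responds 429 Too Many Requests
        q.append(now)
        return True
```

The same logic can live in a custom reverse proxy or in middleware in front of the app; Cloudflare's paid rate-limiting rules do the equivalent at the edge.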


A proxy like Cloudflare or a custom proxy that stores data?

Are there proxy examples you could point us to?

Thanks for your help.



Thank you for this.


100 requests/second isn't that much, especially if you're fronting your website with Cloudflare. Do you have some unauthenticated endpoint(s) that eat up a ton of server CPU?


Thanks for the reply!

The freemium service provides access to machine learning models on GPU instances, served with FastAPI.

Each request invokes a compute-intensive ML model, but perhaps there is something wrong with the FastAPI configuration as well?


It could be.

I watch the FastAPI repos a lot, and tons of people do not understand how async Python works and run their models' sync code in an async context.


Consider us one. :)

We tried removing "async" -- thinking it would force sequential processing -- but it unexpectedly seemed to cause parallel processing of requests, which caused CUDA memory errors.

Before removing "async", this is the weird behavior we observed:

* Hacker blasts 50-100 requests.

* Our ML model processes each request in normal time and sequentially.

* But instead of returning individual responses immediately, the server holds onto all responses -- sending responses only when the last request finishes (or a bunch of requests finish).

* Normally, request 1 should return in N seconds, request 2 in 2N seconds, but with this, all requests returned in about 50N seconds (assuming a batch size of 50).

1. Any suggestions on this?

2. Mind clarifying how sync vs async works? The FastAPI docs are unclear.

Any help would be much appreciated.

This has been extremely frustrating.


Any chance the entire thing can be offloaded to a task queue (Celery/etc)? This would decouple the HTTP request processing from the actual ML task.

The memory errors you're seeing could suggest that you may not actually be able to run multiple instances of the model, and even if you could it may not actually give you more performance than processing sequentially.

Seems like ultimately your current design can't gracefully handle too many concurrent requests, legitimate or malicious - this is a problem I recommend you address regardless of whether you manage to ban the malicious users.


Yeah this is the way.

@headlessvictim2 search for "Asynchronous Request-Reply pattern" if you want more information about this kind of architecture. You will remove any bottleneck from the API server and can easily scale out from the task queue.


Thanks for the suggestion.

How would this work with GPU-bound machine learning models?

The model processing takes > 30 seconds and would still represent the bottleneck?


You would still have the same bottleneck, but the API request would return straight away with some sort of correlation ID. Then the workers that handle the GPU-bound tasks would pull jobs when they are ready. If you get a lot of jobs, all that will happen is that the queue fills up and the clients wait longer and hit the status endpoint a few more times.

Here is an example of what it could look like: https://docs.microsoft.com/en-us/azure/architecture/patterns...
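A minimal, framework-free sketch of the pattern (the names `submit`, `status`, and the trivial stand-in for the model call are illustrative assumptions, not any specific library's API):

```python
import queue
import threading
import uuid

jobs = {}              # job_id -> {"status": ..., "result": ...}
work_q = queue.Queue() # in production: Celery/Redis/SQS etc.

def submit(payload):
    """An HTTP handler would call this and return the ID immediately (202 Accepted)."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "queued", "result": None}
    work_q.put((job_id, payload))
    return job_id

def status(job_id):
    """An HTTP handler for a status endpoint would call this."""
    return jobs[job_id]

def worker():
    """One worker per GPU: jobs run strictly one at a time per worker."""
    while True:
        job_id, payload = work_q.get()
        jobs[job_id]["status"] = "running"
        result = payload * 2  # stand-in for the >30 s model call
        jobs[job_id].update(status="done", result=result)
        work_q.task_done()

threading.Thread(target=worker, daemon=True).start()
```

Because job state lives outside the request cycle, the web tier stays responsive no matter how deep the queue gets, and you can cap the queue length to shed abusive bursts early.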


Thanks for the explanation.

Right now, we use ELB (Elastic Load Balancer) to sit in front of multiple GPU instances.

Is this sufficient or do you suggest adding Celery into this architecture?


Python async is co-operative multitasking (as opposed to pre-emptive).

There is an event loop that goes through all the tasks and runs them.

The issue is that the event loop can only move on to the next task when you reach an await. So if you run a lot of code (say, an ML model) between awaits, no other task can advance during that time.

This is why it is co-operative: it is up to each task to release the event loop, by hitting an await, so other tasks can get work done.

This is fine when you have async libs that hit awaits on IO-related things like db or http calls.

FastAPI will run controllers that are not defined as async functions on a thread pool, but it is still Python, so the GIL applies.

You should do as the sibling comment says and decouple your HTTP layer from your ML, feeding the ML with something like Celery. That way your server is always there to respond (even if just with a 429), hit a cache, or whatever else.
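The event-loop behavior described above can be demonstrated without FastAPI. Here `heavy_model` is a stand-in for the blocking model call, and `asyncio.to_thread` (Python 3.9+) is one way to keep the loop responsive; it also explains the "all responses released at once" symptom, since a blocked loop cannot send any response until it gets control back:

```python
import asyncio
import time

def heavy_model(x):
    # Stand-in for a blocking, GPU/CPU-bound model call.
    time.sleep(0.2)
    return x * 2

async def bad(x):
    # Blocking work inside async def: the event loop is stuck here,
    # so no other coroutine can run (or send responses) until this returns.
    return heavy_model(x)

async def good(x):
    # Hand the blocking call to a thread; the loop stays free.
    return await asyncio.to_thread(heavy_model, x)

async def timed(coros):
    # Wall-clock time to finish all the given coroutines together.
    t0 = time.monotonic()
    await asyncio.gather(*coros)
    return time.monotonic() - t0
```

Three `bad` calls run strictly one after another (about 0.6 s total here), while three `good` calls overlap in threads and finish in roughly the time of one (note `time.sleep` releases the GIL; a real model call may not overlap as cleanly).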


Thanks for the FastAPI explanation. This makes sense.

Right now, we use ELB (Elastic Load Balancer) to sit in front of multiple GPU instances.

Is this sufficient or do you suggest adding Celery to this architecture?


What are the bots' goals? Curious.


To use premium settings without paying.

It appears less like malicious DDoS and more like pragmatic theft.


Could you add a symbolic, one-time fee to the "free" tier to deter multiple accounts and then implement reasonable rate-limits per-account?


what's your site? would like to play with it


Why not ReCAPTCHA?


Thanks for the suggestion.

It is possible, but this degrades the experience for legitimate users.

We prefer solving this without impacting/taxing normal users if possible.


Just add the captcha only for requests coming from the problematic ASNs, like AWS.

edit: Actually, since you use CF, just make a firewall rule that forces the captcha for those ASNs before it even gets to your app. They have a field named "ip.geoip.asnum" for that, and an action called "challenge" which will force a captcha.
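As a concrete sketch (the ASN numbers here are my assumption: 16509 and 14618 are commonly listed for Amazon and 15169 for Google; verify them against current routing data before deploying), the firewall rule expression could look like:

```
(ip.geoip.asnum in {16509 14618 15169})
```

with the rule's action set to "Challenge", so only traffic from those hosting networks gets a captcha while normal residential users are untouched.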


Perhaps Captcha?


Thanks for the suggestion.

It is possible, but this degrades the experience for legitimate users.

We prefer solving this without impacting/taxing normal users if possible.


Recaptcha v3 doesn’t prompt if it thinks you’re a real user.


This could have major GDPR implications if that's something the parent cares about. ReCaptcha is basically Google spyware that happens to provide captcha services.


it's obvious the parent is trolling.


429



