A Cloudflare rule to deter bots




Have you ever gone looking for a Cloudflare Firewall rule to slow down bots? Here is one, because, as is well known, most malicious bots ignore robots.txt directives. This rule helps deter most bots of that kind (but certainly not all of them, since that is impossible).

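What does it do? Every request is checked, and its User-Agent string is converted to lowercase. If the User-Agent contains at least one of the listed phrases, the bot is blocked. The rule also matches requests with an empty User-Agent, as well as clients that call themselves "googlebot" but are not flagged by Cloudflare as a known bot (cf.client.bot).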

As mentioned, this does not stop absolutely every bot, but it fends off those that keep the default User-Agent string of the tool or HTTP library they are built on.

(lower(http.user_agent) contains "curl") or (lower(http.user_agent) contains
"java") or (lower(http.user_agent) contains "python") or (lower(http.user_agent)
eq "") or (lower(http.user_agent) contains "go-http-client") or
(lower(http.user_agent) contains "apache-httpclient") or (lower(http.user_agent)
contains "headlesschrome") or (lower(http.user_agent) contains "phantomjs") or
(lower(http.user_agent) contains "axios") or (lower(http.user_agent) contains
"scrapy") or (lower(http.user_agent) contains "urllib") or
(lower(http.user_agent) contains "puppeteer") or (lower(http.user_agent)
contains "zombie") or (lower(http.user_agent) contains "mysuperuseragent") or
(lower(http.user_agent) contains "faraday") or (lower(http.user_agent) contains
"aiohttp") or (lower(http.user_agent) contains "httpx") or
(lower(http.user_agent) contains "libwww-perl") or (lower(http.user_agent)
contains "httpunit") or (lower(http.user_agent) contains "nutch") or
(lower(http.user_agent) contains "phpcrawl") or (lower(http.user_agent) contains
"mechanicalsoup") or (lower(http.user_agent) contains "geturl") or
(lower(http.user_agent) contains "semrushbot") or (lower(http.user_agent)
contains "ahrefsbot") or (lower(http.user_agent) contains "uptimerobot") or
(lower(http.user_agent) contains "petalbot") or (lower(http.user_agent) contains
"aspiegelbot") or (lower(http.user_agent) contains "dotbot") or
(lower(http.user_agent) contains "leechftp") or (lower(http.user_agent) contains
"masscan") or (lower(http.user_agent) contains "facebookscraper") or
(lower(http.user_agent) contains "majestic") or
(lower(http.user_agent) contains "linkbot") or
(lower(http.user_agent) contains "extractor") or (lower(http.user_agent)
contains "download") or (lower(http.user_agent) contains "scrape") or
(lower(http.user_agent) contains "stats") or (lower(http.user_agent) contains
"harvest") or (lower(http.user_agent) contains "steal") or
(lower(http.user_agent) contains "copy") or (lower(http.user_agent) contains
"take") or (lower(http.user_agent) contains "scan") or (lower(http.user_agent)
contains "smart") or (lower(http.user_agent) contains "stealth") or
(lower(http.user_agent) contains "fastify") or (lower(http.user_agent) contains
"bypass") or (lower(http.user_agent) contains "payload") or
(lower(http.user_agent) contains "scrapingbee") or (lower(http.user_agent)
contains "scraping") or (lower(http.user_agent) contains "node.js") or
(lower(http.user_agent) contains "wordpress") or (lower(http.user_agent)
contains "infobot") or (lower(http.user_agent) contains "grapeshotcrawler") or
(lower(http.user_agent) contains "googlebot" and not cf.client.bot)
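
Before deploying the rule, it can be handy to check which User-Agent strings it would catch. The snippet below is only a rough local sketch of the same matching logic in Python, not something Cloudflare provides: the function name is_blocked, the verified_bot parameter (standing in for cf.client.bot) and the shortened BLOCKED_PHRASES list are all assumptions made for illustration.

```python
# Hypothetical local re-implementation of the matching logic, for testing only.
# BLOCKED_PHRASES is a small subset of the phrases in the rule; extend as needed.
BLOCKED_PHRASES = (
    "curl", "python", "go-http-client", "scrapy",
    "headlesschrome", "semrushbot", "smart",
)


def is_blocked(user_agent: str, verified_bot: bool = False) -> bool:
    """Return True if the rule would match this User-Agent.

    verified_bot is an assumed stand-in for Cloudflare's cf.client.bot field.
    """
    ua = user_agent.lower()  # mirrors lower(http.user_agent)
    if ua == "":  # mirrors: lower(http.user_agent) eq ""
        return True
    if any(phrase in ua for phrase in BLOCKED_PHRASES):  # mirrors: contains "..."
        return True
    # "googlebot" is matched only when the client is not a verified bot
    return "googlebot" in ua and not verified_bot


if __name__ == "__main__":
    # Illustrative sample strings, not taken from real traffic logs.
    samples = [
        "curl/8.4.0",
        "",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36",
        "Mozilla/5.0 (SMART-TV; Linux; Tizen 5.0)",
    ]
    for sample in samples:
        print(repr(sample), "->", "blocked" if is_blocked(sample) else "allowed")
```

Running a list of User-Agent strings from your own access logs through a check like this is a cheap way to spot false positives, since short phrases such as "smart", "take" or "stats" can also appear in the User-Agent of an ordinary visitor.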

All information presented here is the author's personal opinion. If anything is unclear, please get in touch by email: admin@artefaktas.eu
