Website is being crawled by some kind of bot from the USA

Hi there,

For the past week or two, our site recordings have shown a remarkably high number of visitors from the USA on our Shopify website. This is noticeable because our store hardly gets any visitors from the USA; we are an EU-oriented store. It is about 10 visits a day from some kind of bot in the USA that scans our homepage for 10-15 seconds, and the screen display looks very odd (a rectangular screen display). It only visits the homepage, without any clicks or other actions. At first we thought it was the people from a review app whose HQ is in the USA, since these visits started at about the same time. But later we noticed that nothing was really happening, and the details of the site recordings mention ‘ev-crawler 1.0’ (which doesn’t show for normal human visitors). The people from the USA are also now saying that our theme is ‘heavily cached’, so they cannot see whether their updates to the review widget (made by accessing our theme code) work properly on the live website.

It would be handy for us to know whether this ‘crawling USA bot’ comes from an external source or an internal one (in the theme code), and who is behind it. Could it be Google crawling our website for SEO? We did add a lot of new pages in German last week (although only the homepage is being crawled).

Does anyone have experience with this, or can anyone help us out? Also, what would be the best thing to do? Maybe some crawler script code/file that helps us ‘disavow this bot’, or maybe we should go back to an older version of the theme code (if it’s coming from inside the theme, that is).

We still have an average speed score of 60 within Shopify, and Google PageSpeed gives 78 for mobile and 98 for desktop, but we want to have this problem solved before we go to our speed optimisation expert to clean up the caching and so on (otherwise, if the caching is related to the crawling, we worry that our site cache will fill up again in no time).

Thank you in advance for your support, it’s appreciated.

@Mark1988

Yes, you can block unwanted bots through a robots.txt file, which needs to be created and uploaded to your store. The same file also needs to be submitted to Google for SEO purposes; this is usually part of on-page SEO.

Dear Vinods,

Thanks for your reply. Yes, we know we can block things. But it would be good to first find out what or who is crawling our website. Is it something bad or not? And if it is bad, we must know what it is (the source), otherwise we cannot add the right rule to the robots.txt file to block this crawling.

Do you have any clue how we can find out (or is there someone else who can help)?

hi @Mark1988

Yes, in analytics we can find the source of it. I would need read-only access to the analytics account so I can check the source.

Dear Vinods, thanks for your reply. Can you show or tell us where we can see that in our Google Analytics account? We are logged in and connected to our website. We also see some visitors from the USA, but it only mentions 3 visitors in total, while we have been getting 10-15 crawling visits per day for 1 or 2 weeks now.

hi @Mark1988

Check the source / medium of all users for those particular dates.

Can you share a screenshot of where to find it, or the menu path from the home screen? I cannot find it.

I have also checked and see some USA visitors, but it seems this crawl bot is not being registered, although I see it clearly in the Hotjar site recordings, and the user ID is different every time as well.

You need to go to the GA4 account of your website,

then go to

Acquisition >> Traffic acquisition

and look at the source or medium.

Okay, we found the screen. What do we need to look for there, in order to block it with robots.txt?

We see:

  1. Unassigned

  2. Direct

  3. Organic search

  4. Paid Shopping

  5. Organic social

  6. Paid search

  7. Referral

If it’s tracked, it must be under 1, 2 or 3. But where do we go from here? We still cannot see the actual source of the bot from the USA that is crawling our website…

You need to check in

referral

direct

unassigned

If you see the plus sign in the first column, click it and choose source / medium; from there you can track it. If you are not used to GA4 and have a GA UA account, you can also check there.

I checked there with the arrow and tried various selections. Besides not finding the actual user ID, with the country selections I don’t see any visits registered from the USA, although the recording app Hotjar records 10-15 of these from the USA, always on a desktop with a screen size of 5835 x 1000 pixels, mentioning ‘ev-crawler 1.0’.

Since it’s apparently not coming from outside, or not being registered, would it be worth trying to publish an earlier version of the theme code, from ‘before’ this review app’s support team in the USA accessed our theme code?

By the way, their HQ is in the USA, but their IT tech support noticed that they themselves show up in the site recordings as coming from the Philippines, and their site visits are in fact being tracked by GA, although there are only a few in total, which does not reflect the 10-15 visits per day.

Not sure if it will do anything, but maybe it’s worth a try…? Or do you have any other suggestions?

I am there and can select everything: country, screen size, etc. But nothing matches these ‘crawling’ visits from the USA with a screen size of 5835 x 1000 pixels.

The best I can come up with is these two, but they are not clickable and do not match the site recordings in Hotjar…

Direct (direct) / (none)

Organic Search google / organic

If you can let me know what time those visits came in and can share read-only access to your Google Analytics, that would honestly be much better.

In Hotjar I have saved the site recordings of the past few days; they look like the list below. Which email/user ID do I need to invite? Your account name at Shopify.com? Please let me know, then I can invite you as read-only.

20 recordings

| Relevance | Date | User | Country | Actions # | Pages # | Duration | Landing page | Exit page |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Very low | 03 Aug, 11:48 | c9faec1a (new) | United States | 1 | 1 | 0:14 | / | / |
| Very low | 03 Aug, 08:32 | a511f68c (new) | United States | 1 | 1 | 0:15 | / | / |
| Very low | 03 Aug, 05:28 | d053f72a (new) | United States | 1 | 1 | 0:14 | / | / |
| Very low | 03 Aug, 02:26 | 7cad0d57 (new) | United States | 1 | 1 | 0:12 | / | / |
| Very low | 02 Aug, 22:59 | 56c36835 (new) | United States | 1 | 1 | 0:46 | / | / |
| Very low | 02 Aug, 18:38 | a01a8367 (new) | United States | 1 | 1 | 0:20 | / | / |
| Very low | 02 Aug, 14:25 | 4186b618 (new) | United States | 1 | 1 | 0:15 | / | / |
| Very low | 02 Aug, 06:12 | 905aba41 (new) | United States | 1 | 1 | 0:16 | / | / |
| Very low | 02 Aug, 02:06 | 6e4e972b (new) | United States | 1 | 1 | 0:16 | / | / |
| Very low | 01 Aug, 21:52 | a7868b06 (new) | United States | 1 | 1 | 0:18 | / | / |
| Very low | 01 Aug, 17:33 | 8bad93df (new) | United States | 1 | 1 | 0:15 | / | / |
| Very low | 01 Aug, 13:31 | 8ad8f100 (new) | United States | 1 | 1 | 0:14 | / | / |
| Very low | 01 Aug, 10:27 | b3d3816d (new) | United States | 1 | 1 | 0:12 | / | / |
| Very low | 01 Aug, 07:19 | 2bec98c7 (new) | United States | 1 | 1 | 0:13 | / | / |
| Very low | 01 Aug, 04:07 | d43b4a39 (new) | United States | 1 | 1 | 0:14 | / | / |
| Very low | 01 Aug, 02:05 | 73391bf5 (new) | United States | 1 | 1 | 0:02 | / | / |
| Very low | 01 Aug, 01:04 | fbc6f512 (new) | United States | 1 | 1 | 0:15 | / | / |
| Very low | 31 Jul, 22:06 | 83b944c1 (new) | United States | 1 | 1 | 0:12 | / | / |
| Very low | 31 Jul, 19:10 | 5e789e37 (new) | United States | 1 | 1 | 0:12 | / | / |
| Very low | 31 Jul, 16:05 | da47b1fc (new) | United States | 1 | 1 | 0:13 | / | / |
These Hotjar data are mostly homepage visits.

You can grant me Google Analytics access (not Shopify access) on
sales [email removed] digitalcentrics .com

but read-only access only, nothing else

and if possible Hotjar access too, on vinod [email removed] digitalcentrics . com

Many thanks for your reply. OK, I did both: an invite for GA and an invite for Hotjar.

Checking Hotjar, one minute please. I do not have access to GA yet.

FOUND IT
ev-crawler/1.0
This is the crawler from the USA.
Block it in robots.txt.

You can remove the Hotjar and Analytics access now.

Yes, I noticed this as well in the Hotjar recordings.

Do I block it via Google Search Console

AND

in the internal theme code?

Or what is needed…

It would also be handy to have the exact rule (script code / URL) that needs to be entered :slightly_smiling_face:

Block it in the robots.txt file.
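As a rough illustration of what such a robots.txt rule could look like, the Python sketch below holds a candidate rule as text and checks it with the standard library's `urllib.robotparser`. The `ev-crawler` token comes from the Hotjar recordings; the store URL is a placeholder, and whether this crawler honours robots.txt at all is an assumption, since the file is only advisory.

```python
from urllib.robotparser import RobotFileParser

# Candidate rule. "ev-crawler" is the token seen in the Hotjar recordings;
# it is not confirmed that the bot reads or obeys robots.txt.
ROBOTS_RULES = """\
User-agent: ev-crawler
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_RULES.splitlines())

# Placeholder store URL, for illustration only.
HOME = "https://example-store.com/"

blocked = not parser.can_fetch("ev-crawler/1.0", HOME)
others_allowed = parser.can_fetch("Googlebot", HOME)

print("ev-crawler blocked:", blocked)
print("other crawlers still allowed:", others_allowed)
```

On Shopify the robots.txt file is generated from the theme's `robots.txt.liquid` template, so the two rule lines would most likely be added there rather than uploaded as a separate file.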