August 5, 2018
tl;dr: if you're getting this error, you've tripped an automated threshold that Microsoft Support has to reset, or that you have to wait out until it expires.
Check your message trace logs for signs of abuse, then go ahead and call support.
I'm currently working on an ~800-user hybrid Exchange deployment and, due to some issues with the existing 2010 environment, decided to deploy an Exchange 2016 server to handle hybrid duties. Because the customer is switching from Mimecast inbound/outbound to EOP/ATP, I decided to route all messages from on-prem out through Office 365 immediately, rather than letting on-prem and Office 365 each route their own mail directly. Centralized routing reversed, basically.
Everything worked fine for 24 hours, but then I got frantic calls from the customer that they couldn't send or receive e-mail. Looking at the logs I saw rejected messages with the error code:
5.7.700-749 Access denied, tenant has exceeded threshold
I've done a ton of these deployments and had never seen this error, but common sense told me it was related to some sort of abuse prevention. I started running message traces and saw no indication that anyone had been phished. I double-checked my connectors to make sure I hadn't accidentally created some sort of open relay on-prem that was abusing EOP as a smarthost. I couldn't find anything.
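If you export a message trace to CSV, a few lines of Python can make the "signs of abuse" check less of an eyeball exercise. This is only a rough sketch: the column names `received`, `sender_address` and `recipient_address` are assumptions and will need adjusting to match whatever your export actually contains, and the sample data is made up.

```python
import csv
import io
from collections import defaultdict


def load_trace(text):
    """Parse CSV text from a message trace export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def unique_recipients_per_day(rows):
    """Map each date (YYYY-MM-DD) to the number of distinct recipients that day."""
    seen = defaultdict(set)
    for row in rows:
        # Assumes ISO-style timestamps such as 2018-08-04T09:00:00.
        day = row["received"][:10]
        seen[day].add(row["recipient_address"].strip().lower())
    return {day: len(addrs) for day, addrs in seen.items()}


def unique_recipients_per_sender(rows):
    """Return (sender, distinct recipient count) pairs, busiest sender first."""
    seen = defaultdict(set)
    for row in rows:
        sender = row["sender_address"].strip().lower()
        seen[sender].add(row["recipient_address"].strip().lower())
    return sorted(
        ((sender, len(addrs)) for sender, addrs in seen.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Made-up sample data standing in for a real export.
SAMPLE = """received,sender_address,recipient_address
2018-08-04T09:00:00,alice@example.com,bob@example.net
2018-08-04T09:05:00,alice@example.com,Bob@example.net
2018-08-04T10:00:00,carol@example.com,dave@example.org
2018-08-05T08:00:00,alice@example.com,erin@example.org
"""

rows = load_trace(SAMPLE)
per_day = unique_recipients_per_day(rows)
per_sender = unique_recipients_per_sender(rows)

for day, count in sorted(per_day.items()):
    print(day, count)
for sender, count in per_sender:
    print(sender, count)
```

A single sender with a wildly higher recipient count than everyone else is the sort of thing that would point to a compromised account or a relay problem. A jump spread evenly across senders looks more like a legitimate routing change.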
I called Microsoft and thankfully got a helpful engineer on the first try (not the norm, unfortunately). He immediately determined that, because we had gone from zero messages to over 1000 unique recipients in a day, we had triggered an automated abuse threshold. He confirmed that suspicion via health checks on his side and used an internal script to reset the threshold, which restored mail flow within an hour or two.
The bad thing about this is that you're entirely at Microsoft's mercy. I was able to restore outbound mail flow for the on-premises users by creating a new send connector, but the mailboxes on Office 365 were not able to send or receive, and even some inbound traffic to on-prem users was being rejected by EOP. If Microsoft had replied that they were going to wait 24 hours to fix the issue, I was prepared to offboard the few mailboxes we had already migrated back to on-prem, and even cut MX back over until we could figure out what was happening. Thankfully all that was avoided.
So, pro-tip: it may be better to slowly scale up your outbound traffic rather than going from zero to 1000 unique recipients in a day, or perhaps to alert Microsoft in advance if you're planning to do it that way. And kudos to Microsoft for having Tier 1 support tools capable of fixing what was ultimately a simple false positive.
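To make "slowly scale up" a bit more concrete, here is a small sketch that lays out a day-by-day cap on unique recipients. The starting value, target and growth factor are purely illustrative assumptions. Microsoft's actual thresholds weren't shared with me, so treat this as a planning aid, not a guarantee that you'll stay under the limit.

```python
def ramp_schedule(start, target, growth):
    """Return daily caps on unique recipients, multiplying by `growth` each day
    until `target` is reached. All inputs are illustrative, not official limits."""
    if start <= 0 or target < start or growth <= 1:
        raise ValueError("need start > 0, target >= start and growth > 1")
    caps = []
    cap = start
    while cap < target:
        caps.append(cap)
        # Always move forward by at least one so small caps cannot stall the loop.
        cap = max(cap + 1, int(cap * growth))
    caps.append(target)
    return caps


# Hypothetical plan: start at 100 recipients per day and double daily up to 1000.
schedule = ramp_schedule(start=100, target=1000, growth=2)
for day, cap in enumerate(schedule, start=1):
    print(f"day {day}: route mail for up to {cap} unique recipients")
```

In practice the cap would translate into how many users or domains you move onto the new outbound route each day.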