For Facebook's IP ranges to be blackholed by a 'configuration error', all of the following safeguards would have to fail.
Background:

- A major network like Facebook's will peer with other network providers at multiple locations in multiple DCs.
- Large networks like this will have out-of-band (OOB) access to border routers (think a modem over a mobile phone link).
- Large companies with critical assets have very strict change control procedures.
Change control is a key area to understand with regard to this outage. Most people might not be aware, but it is often quite an involved process that incorporates a number of checks and failsafes.
Consider a 'normal' change for a large company:

- Someone in the business decides that a change needs to be made for whatever reason, so they raise a change request.
- This initial change request will often be high level, and a business approver will have to sign it off as required and worth the risk.
- Technical engineers will then create a more detailed change request. This will include such things as:
  - Details of what is going to be changed, why, and on what equipment.
  - Whether the change poses a risk to critical services.
  - A detailed change script of the actual changes to be performed, including a backout plan should the change fail.
  - Key services identified beforehand, with a test plan run before the change (to make sure it's all OK) and re-run after the change has been made.
- Another engineer will peer review the detailed change request and provide technical approval, or push back on anything that might be wrong or unclear. They also provide assurance that the technical changes being made will actually achieve the stated business goals.
- Once all this is done, the change goes to a final approval team who have a 'big picture' view and can juggle changes between the various data centers, ensuring there are no overlaps between different change requests that could cause unexpected issues.
Once all this is approved, the change will be scheduled for an out-of-hours window, depending on which timezone the relevant DC is in.
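The gating logic described above can be sketched as a toy model. This is purely illustrative (Facebook's actual change-management tooling is not public); the gate names and `ChangeRequest` class are assumptions made up for the example. The point is that a change only becomes schedulable after every approval stage has signed off, in order.

```python
# Illustrative sketch, NOT Facebook's actual tooling: a change request must
# clear every approval gate before it can be scheduled for implementation.
from dataclasses import dataclass, field

# Hypothetical gates mirroring the process described above.
GATES = ["business_approval", "technical_peer_review", "final_approval"]

@dataclass
class ChangeRequest:
    summary: str
    approvals: set = field(default_factory=set)

    def approve(self, gate: str) -> None:
        if gate not in GATES:
            raise ValueError(f"unknown gate: {gate}")
        self.approvals.add(gate)

    def can_schedule(self) -> bool:
        # Only schedulable once every gate has signed off.
        return all(g in self.approvals for g in GATES)

cr = ChangeRequest("Adjust BGP policy on border routers in one DC")
cr.approve("business_approval")
assert not cr.can_schedule()   # still awaiting peer review and final approval
cr.approve("technical_peer_review")
cr.approve("final_approval")
assert cr.can_schedule()
```

A change that silently wipes out every peering announcement at once would have to slip past each of these independent checks.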
It is highly unlikely that changes of this nature would be made at all of Facebook's data centers at the same time.
https://www.datacenters.com/facebook-data-center-locations
And certainly not within business hours.
Even assuming all this was done, the person conducting the change would be using the out-of-band connectivity to perform it. This is done so that if they make a mistake, or there is an undocumented bug in the IOS code (it happens), they are not kicked off the device and can still remediate the problem.
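There is usually a second safety net on top of OOB access: many router operating systems support a "confirm or auto-rollback" commit (e.g. Junos's `commit confirmed`). The sketch below is a hedged illustration of that idea in Python, not real router code; the `Router` class and method names are invented for the example. If the operator is cut off by their own change and never confirms it, the previous configuration comes back automatically.

```python
# Illustrative sketch of a "confirm or auto-rollback" commit, as found on
# many router OSes (e.g. Junos's `commit confirmed`). Class and method
# names are hypothetical; real devices use a timer, modelled here as an
# explicit rollback_timer_expired() call.

class Router:
    def __init__(self, config: str):
        self.running = config   # currently active configuration
        self.saved = config     # last-known-good configuration
        self.pending = False    # is an unconfirmed change in flight?

    def commit_confirmed(self, new_config: str) -> None:
        # Apply the candidate config, but remember the last-known-good one.
        self.saved = self.running
        self.running = new_config
        self.pending = True

    def confirm(self) -> None:
        # Operator still has access: make the change permanent.
        self.pending = False

    def rollback_timer_expired(self) -> None:
        # No confirmation arrived (e.g. the change cut the operator off):
        # revert to the saved configuration automatically.
        if self.pending:
            self.running = self.saved
            self.pending = False

r = Router("announce 203.0.113.0/24")          # illustrative prefix
r.commit_confirmed("withdraw 203.0.113.0/24")  # change severs connectivity
r.rollback_timer_expired()                     # operator never confirms
assert r.running == "announce 203.0.113.0/24"  # old config restored
```

With a mechanism like this in place, even a change that locked the operator out should have undone itself, rather than staying down for hours.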
The very fact that engineers could not get into the building to fix the problem once it started means that no one was actually making the change at the time.
The above procedure is typical for a large enterprise with public facing critical assets - Facebook's policies are likely to be even tighter.
TL;DR - In order for this 'mis-configuration' to be a thing, all of the checks would have to have missed the potential issue, the change would have to be implemented simultaneously at all of Facebook's data centers where they peer with other internet providers - and all by some kind of automated system with no human oversight, during normal business hours.
It simply isn't feasible.
UPDATE: Here is more information on the 'official' version..
https://www.theregister.com/2021/10/06/facebook_outage_explained_in_detail/
Cloudflare broke down what they observed happen in this blog post they made while Facebook was still offline.
Interesting to note that IP blocks for Facebook's DNS servers which were withdrawn were routed to US data centers. Apparently Facebook hosts their DNS servers on AWS instead of in their own data centers, so the routes would have been pointing to Amazon (unconfirmed).
Small point that makes all the difference: routes were not withdrawn. The entire AS disappeared.
The first is a normal consequence of a route or router failure. The second is never normal.
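The distinction being drawn here can be made concrete with a toy function: given before/after snapshots of a routing table mapping prefix to origin AS, a route or router failure removes *some* of an AS's prefixes, while the AS "disappearing" removes *all* of them. The prefixes and data structure below are illustrative, not Facebook's real announcements.

```python
# Toy sketch of the "withdrawn routes" vs "AS disappeared" distinction.
# A routing table snapshot is modelled as {prefix: origin_asn}.

def as_status(before: dict, after: dict, asn: int) -> str:
    mine_before = {p for p, a in before.items() if a == asn}
    mine_after = {p for p, a in after.items() if a == asn}
    if mine_after == mine_before:
        return "unchanged"
    if mine_after:
        return "partial withdrawal"   # normal route/router failure
    return "AS disappeared"           # every prefix gone: never normal

# Illustrative prefixes (documentation ranges), with AS32934 as Facebook.
before = {"203.0.113.0/24": 32934, "198.51.100.0/24": 32934,
          "192.0.2.0/24": 64500}
after_partial = {"198.51.100.0/24": 32934, "192.0.2.0/24": 64500}
after_gone = {"192.0.2.0/24": 64500}

assert as_status(before, after_partial, 32934) == "partial withdrawal"
assert as_status(before, after_gone, 32934) == "AS disappeared"
```

In other words: a routing incident shrinks the set of reachable prefixes; the AS vanishing empties it entirely.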
Are you sure the entire AS disappeared? That Cloudflare blog post shows that other IP blocks on AS32934 were still routed at the time.
I also saw a timelapse of many of their BGP routes being withdrawn over time. It's possible that the blocks Cloudflare mentioned as still routed at the time went down later as well... But if the entire AS disappeared, why didn't all the routes go down at the same time?