* Posts by Ruairi

43 publicly visible posts • joined 11 Jul 2008

CloudFlare warns of another massive botnet, er, flaring up

Ruairi
Trollface

I wonder if CloudFlail host the front-end for this one also?

Fat-thumbed a BGP entry? Relax, now your pain has a name

Ruairi

So, we should name the leaks instead of actually fixing the broken system which is currently in place? Yup, it's the IETF at work again!

I'm looking at you, ROA, IRR sets and manual configuration changes. Until we are able to validate a route inside of the BGP process, we will continue to see route leaks.

Lastly, a niggly point: Telia's outage was due to an IS-IS cost, not a BGP leak.

Swedish sysadmins reach for the hex key, reassemble services after weekend DDoS

Ruairi

SUNET put up a post about how the DDoS has been reported on. It's a good read and can be found here:

https://www.sunet.se/blogg/showerthoughts-ddosing-an-important-social-institution-and-fixing-it-part1/

Comcast: Google, we'll see your 1Gbps fiber and DOUBLE IT

Ruairi

Re: sarcastic comment...

> 131.6 EB / (30.4375 * 24 * 60 * 60) = ~50.04 TB per second.

There are orgs doing 50Tbit/sec already. Today. Single orgs (They're single digits, but they exist). I'd say that we're already beyond 400Tbit/sec on the global backbone.
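For anyone checking the units: the quoted figure is in terabytes per second, so it needs multiplying by 8 before comparing it with Tbit/sec. A quick sketch of the arithmetic, assuming decimal (SI) exabytes and the same 30.4375-day average month as the quote. The function and variable names are mine.

```python
# Convert a monthly traffic volume into an average rate.
# Assumptions: SI units (1 EB = 10**18 bytes) and an average
# month of 30.4375 days, as in the quoted calculation.
EB = 10**18
SECONDS_PER_MONTH = 30.4375 * 24 * 60 * 60


def average_rate(exabytes_per_month):
    """Return (terabytes per second, terabits per second)."""
    bytes_per_second = exabytes_per_month * EB / SECONDS_PER_MONTH
    tbytes = bytes_per_second / 10**12
    tbits = tbytes * 8  # 8 bits per byte
    return tbytes, tbits


tbytes, tbits = average_rate(131.6)
print(f"{tbytes:.2f} TB/s is about {tbits:.0f} Tbit/s")
```

So a ~50 TB/s average is roughly 400 Tbit/s, which is where the two different numbers above come from.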

Alca-Lu cooks up 400 Gbps router interconnect

Ruairi

Where's the datasheet?

What optic will it take?

Oversubscription? Density/module?

What spec does it conform to?

Forget WHITE BOX, it's time for JUNK BOX NETWORKING

Ruairi

Well, storage is actually gonna be a bit of a pain on cheaper 10G switches.... and NICs (Actually - side note, our 10G NICs are probably 3x to 4x the cost of the ports they connect to)

If you're using something older that's got poor buffers (Or poor buffering architecture!), media change and store-and-forward, you're gonna have a bad time with drops in your network, specifically around SAN and storage (Since they can pump out bits quite fast for the cost)

In general, when the network tends towards melt-down the effects are more widespread than a single server going down, which is why I'm always careful in purchasing. An underperforming network is also SUPER costly to re-provision (As is the staff to make it happen).

These are the reasons that I'm quite hesitant to go down the route of junkboxing. Even whiteboxing does not make sense at small scale (Since the engineer cost for setting up/maintaining/developing the boxes exceeds the cost of the network at small scale).

Ruairi

I read the article again, and I still see my points as valid.

Quick maths to show:

40x10G switch @ 1,500 GBP -> 37.5 GBP/port.

I'm currently looking at about 100GBP/10G port. Granted it's about 2.7x the price, but it's a) new, b) under warranty, c) having features actively developed, d) not going to die of old age/degrading soldering/degrading components and e) has support costs included.
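The per-port comparison is easy to reproduce. A minimal sketch using only the two prices quoted above; the function name is made up, and nothing here accounts for power, support or engineer time.

```python
def price_per_port(switch_price_gbp, ports):
    """Price of one port when the switch cost is spread evenly."""
    return switch_price_gbp / ports


# Figures from the comment above: a 40x10G switch at 1,500 GBP,
# versus roughly 100 GBP per new 10G port.
used_port = price_per_port(1500, 40)
new_port = 100.0
ratio = new_port / used_port

print(f"used: {used_port:.1f} GBP/port, new: {new_port:.1f} GBP/port")
print(f"new kit costs {ratio:.2f}x as much per port")
```

The ratio works out at roughly 2.67.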

Also, my power draw is about 5W/port. We cannot calculate what the actual power draw is - but on older kit, I would wager it's quite a bit higher, which will end up costing more in the long run in a DC (Where power IS at a premium).

Lastly, my docs are current, and the amount of engineer hours spent on each device is minimal (I've got many devices in my network I've never had to log in to - auto provision, monitoring in place, work flawlessly - the kind of features you don't find on older kit).

So I'd respectfully ask you to re-read my original comment and try to follow my logic of breaking down pricing into per-unit.

For sure I'll give you that server/computer related stuff - totally makes sense to go down that route. I'm just not convinced on the networking side.

Ruairi
Pint

Everything comes at a price, and the phrase "pay in peanuts, get monkeys" springs to mind.

Yes, you can get a cheap 10G switch - but what's its buffering like? Feature support? Supportability in production when things go down?

Honestly, it's not helpful to say "A switch for X currency" - we should be talking about price/port and watts/port. Both will give you a better measure of what you're actually looking at. Also, whether the port is routed or switched (A routed port will probably be 20x the cost of a switched port)

I guess it comes down to how often you expect the network to go down, or performance expectations (And if you're running big data or not, as if you are, good luck with cheap switches). These days, I'm finding that the expectation for uptime on the network layer is 100%, with a corresponding 0% packet loss.

Beer, because that's what cheap switches drive me to.

Cisco patches three-year-old remote code-execution hole

Ruairi

To be fair, you don't leave your management interfaces exposed to the public at large when you're talking about networking devices.

*Even* if you're running 10-15 year old kit that does not support SSH, it's in its own private/management VLAN that's heavily firewalled/no access to the outside world.

iCloud fiasco: 100 FAMOUS WOMEN exposed NUDE online

Ruairi

Re: "Knowing these photos were deleted a long time ago"

By "cloud" I assume you mean CDN, and if that's the case - there are two operations a CDN needs to do to be considered a CDN - load content onto the CDN, and invalidate assets at the edge.

Same goes for "cloud" - reclaiming freed up storage actually makes financial sense at scale.

Report: Sprint to bring Sony Xperia into tough US smartphone market

Ruairi

Re: Xperias are great

Also a huge fan of the Xperia lines. I've had an activ, ZR and Z1C in the last 2 years, extremely happy with them all (Apart from some annoying teething issues/bugs in 1st/2nd firmware release).

The quality of low light pics from the Z1C is astounding. Battery life, well, I get to make fun of the poor iPhone users who get maybe a day.

Form factor of the Z1C is basically perfect for me. It's just a shame that the glass back was replaced with some plastic stuff last minute, which tends to scratch all too easily.

Yet another WordPress vuln: Image furtler plugin lets BADNESS in

Ruairi

Yet another WordPress vuln: Image furtler plugin lets BADNESS in

<pedantic>

Don't you mean "Yet another WordPress plugin vuln: Image furtler plugin lets BADNESS in"

</pedantic>

Cisco CEO Chambers: 'Infrastructure is commodity'

Ruairi

Will ACI and NFV still be "long term" goals when they re-invent the next big thing in 12 months?

This company is going nowhere new fast.

Cisco reboots PC with $1500 'Scandafornian' Android fondleslab

Ruairi
Facepalm

Cisco: proving time and time again that the future of the internet is in consumer grade junk.

Little pink handjob: Sony's Xperia Z1 Compact

Ruairi

Have had one for about a month now. The 4.2.2 update was pushed to me last week, so you might want to update the article.

I'm upgrading from the ZR, which I absolutely love. The compact is a really awesome phone. I get 3 days with heavy usage (WiFi,4G,GPS). Early OS revs were a bit fruity, but it seems that most issues are addressed now. The camera is really impressive, and the AR mode is hilariously awesome.

The only negative things I can say about the phone are that the lack of an option to disable vibrate on touch is *annoying* and the plastic back scratches all too easily.

Cisco belches forth mighty intergobblenator CLOUD OF DOLLARS

Ruairi
Joke

Ah, yes, this makes perfect sense as every major cloud already runs on Cisco equipment, and Cisco have been extremely involved in the Cloud community and listening to the major players when it comes to features and direction for their hardware.

Turkey's farcical Twitter ban leads to SPIKE in tweets

Ruairi

I'm surprised at this for a number of reasons.

Any single competent network engineer would just drop AS13414 at the border, and be done with it (And AS35995/AS54888, since Twitter have multiple AS's). However that's not the case... So I can only assume the technical advisor to the government either didn't want to have a functional ban in place, or was not competent enough to advise correctly.

I'm also surprised that Twitter don't start the whack-a-mole game of spinning up EC2 instances as lightweight reverse proxies for their service, then GeoDNS to target Turkish users.
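To make "drop the AS at the border" concrete, here is a rough sketch of the idea in Python rather than router config. The three ASNs are the ones named above; the route table, the documentation prefixes and the private-use neighbour ASNs are all made up for illustration, and no vendor syntax is implied.

```python
# ASNs named in the comment above.
BLOCKED_ASNS = {13414, 35995, 54888}


def accept_route(as_path):
    """Return False if any AS in the path is on the block list."""
    return not any(asn in BLOCKED_ASNS for asn in as_path)


# Hypothetical routes as seen at the border: (prefix, AS path).
routes = [
    ("192.0.2.0/24", [64500, 13414]),
    ("198.51.100.0/24", [64501, 64502]),
    ("203.0.113.0/24", [64500, 35995]),
]

accepted = [prefix for prefix, path in routes if accept_route(path)]
print(accepted)
```

On real kit this would be an AS-path filter applied to the sessions on the border routers.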

Cisco HAUNTED by $655m memory components snag

Ruairi

Hang on, seems like you guys missed out the real fun in this story

Q: How and when did Cisco find out about this issue?

A: Cisco first became aware of this issue in December 2010.

And later:

Only in late 2012 did field failures and supplier review data point to a potential customer impact

So it took an engineering company 2 years to figure out it was bad memory, and the same time frame to admit to customers they were exploring an issue (They denied all knowledge until this PR stunt was ready to be pushed out the door).

I don't know about you, but that seems like a very long time, a significant portion of the shelf life of most of these products.

I want SDN and I want it now!

Ruairi
Pint

Re: Right on.

TaabuTheCat is indeed correct in this - one primary concern is that the blast radius (regardless of your specific architecture) is fundamentally larger.

Another point I neglected to make is that this brings us way back in terms of network stability as a whole. It'll be like the late 90's again in terms of OSPF/ISIS - running fresh builds, having outages because the implementations are just not mature. You can argue about architecture all you want, but the fact is that you just cannot afford outages. In some cases, it's better to solve the problem with existing protocols, rather than throw everything out and start again (However I would argue that _some_ protocols should be thrown out by default).

The fact is that SDN is only built for scale, and nobody running at that scale can afford _any_ downtime. I actually fully support most of the key arguments behind SDN, it's just that some of the principles seem to come directly from the VM world, and won't have a 1:1 translation to the networking world. I am a huge believer in automation, in partitioning services. However I'm also a believer in correctly architecting the network to your users' requirements, and automating deploy time.

Lastly - it's not so much FUD as paranoia and skepticism, due to watching what sales promise go up in smoke with a frequency that's left me bitter right through to my core.

Ruairi
Pint

So we have finally started to convince the world that large layer 2 domains are a very bad idea (think: blast radius), and that lots of individual devices working together (Distributed) is much better than centralising everything...

Until along came SDN, which re-centralises everything once again (Controller wise). I can already see the mega-outages a bug at the controller level will cause, or the lack of individual node optimisation this will cause.

It's not that I think SDN is a bad idea, I just think it's half baked right now. I also think that simplicity in design, and a good provisioning tool and excellent engineers will trump the cost of SDN, in terms of man hours, resources and impact.

Beer; because this is what SDN drives me to.

Netflix speed index shows further decline in Verizon quality

Ruairi

Can any of you tell malice from incompetence? Why assume malevolence when incompetence is the more likely answer?

This saga is non-trivial, and it's got many moving parts, namely:

1. Netflix's peering policy ( https://signup.netflix.com/openconnect/guidelines , min 2Gbit _each way_ at 95th percentile).

2. Netflix's Open Connect program - ISPs don't want to lose rack space + power to these boxes.

3. ISPs want to make the content providers pay for content traversing their network.
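On point 1, a quick illustration of why "at 95th percentile" matters. This is the common billing-style calculation (sort the samples, throw away the top 5%, take the highest one left); whether Netflix measures it exactly this way is an assumption on my part, and the sample data is invented.

```python
def percentile_95(samples_mbps):
    """Highest sample left after discarding the top 5% of samples."""
    ordered = sorted(samples_mbps)
    # Integer arithmetic for ceil(0.95 * n), to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


MIN_MBPS = 2000  # the 2Gbit figure quoted in point 1

# Invented traffic samples in Mbit/s, 100 per direction.
inbound = [1500] * 90 + [2100] * 6 + [5000] * 4
outbound = [1500] * 96 + [5000] * 4

for name, samples in (("in", inbound), ("out", outbound)):
    p95 = percentile_95(samples)
    print(name, p95, "meets minimum" if p95 >= MIN_MBPS else "below minimum")
```

Note the outbound direction peaks at 5Gbit but still misses the minimum, because short bursts fall inside the discarded 5%.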

What's more than likely happening is that Netflix traffic is taking the congested path (They are a victim of their own success) inside of Verizon's network. I don't think there is malevolence involved here, however at the same time, there's nothing to be gained right now for Verizon. They're not losing customers, and _if_ they do lose customers over this Netflix saga, it's the customers who tend to cost them more in transit bills.

Don't be a DDoS dummy: Patch your NTP servers, plead infosec bods

Ruairi

BCP38 people - ask for it from your upstream provider.

Also, for any of you running any networks - drop NTP on your border, drop ingress traffic with a source address of your address space at the border. Drop egress traffic that does not match your address space at your border.
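A minimal sketch of those three border rules in Python, just to make the logic concrete. The prefixes are documentation ranges standing in for your own address space, `border_action` is a made-up helper, and reading "drop NTP" as "drop inbound UDP/123" is my assumption. Real filtering belongs in ACLs or uRPF on the border routers, not in a script.

```python
import ipaddress

# Hypothetical address space (documentation prefixes).
OUR_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]
NTP_PORT = 123


def is_ours(addr):
    """True if addr falls inside one of our own prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in OUR_PREFIXES)


def border_action(direction, src, protocol=None, dst_port=None):
    """Return 'drop' or 'permit' for a packet crossing the border."""
    if direction == "ingress":
        # Traffic arriving from outside must not claim to be from us.
        if is_ours(src):
            return "drop"
        # Assumed reading of "drop NTP on your border".
        if protocol == "udp" and dst_port == NTP_PORT:
            return "drop"
    elif direction == "egress":
        # BCP38: only let out traffic sourced from our own space.
        if not is_ours(src):
            return "drop"
    return "permit"


print(border_action("ingress", "192.0.2.10"))
print(border_action("egress", "198.51.100.7"))
print(border_action("egress", "2001:db8::1"))
```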

Amazon, Facebook, Google give Cisco's switches the COLD shoulder

Ruairi

Been saying it for years - Cisco is going to lose the switching market, and probably the routing market after that.

I see them being relevant in corporate IT (Voice, user access) and servers (I'm guessing that their switching market share will collapse into the embedded switches they're building into UCS chassis).

I'm not 100% sure that home grown devices are mature enough right now for the regular DC. However from an OS perspective, Cumulus Networks should change this in the coming years, and hopefully a larger install base of Trident I / II based boxes will also level the playing field.

Interesting times ahead, and not even one mention of SDN (d'oh!)

What's wrong with network monitoring tools? Where do I start...

Ruairi

What I'd really like to see is event correlation in an intelligent way...

By parsing flow data in almost-real time, looking for patterns in syslog, interface changes (ie: flap, or an interface counter going +/- X% across samples), and snarfing up accounting data. Hell, even take an iBGP feed of updates from my eBGP peers, and a feed of OSPF LSAs and correlate an event with a specific set of updates. There's so much room for correlation, there's just nothing about that I've found that works for me.

I think overall, we have all the tools we need to do this, but the time needed to integrate them all, make them talk nicely, and set intelligent thresholds, relative thresholds and even a little historical prediction based on previous events is just not worth it. Lately, instead of spending time on this, I'm fighting to get nfsen/observium/smokeping/homegrown scripts to talk together and give me a coherent view of my traffic patterns.
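The "+/- X% across samples" check is the easy part. A bare-bones sketch, assuming you already have per-interval rates for an interface; the function name, the sample values and the 50% threshold are all invented, and the hard bit (lining these events up with syslog, BGP and OSPF updates) is not attempted here.

```python
def flag_changes(samples, threshold_pct):
    """Return (index, pct_change) for each sample that moves more than
    threshold_pct relative to the sample before it."""
    events = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        if prev == 0:
            continue  # relative change is undefined from zero
        change = (cur - prev) / prev * 100
        if abs(change) > threshold_pct:
            events.append((i, round(change, 1)))
    return events


# Invented per-interval rates in Mbit/s for one interface.
rates = [400, 410, 405, 900, 880, 300]
events = flag_changes(rates, 50)
print(events)
```

Each flagged index is a candidate timestamp to look up in the other feeds.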

I'm battling with the stupidity of SNMP traps, SNMP's format and the absurdity of 5 minute samples when I have 40Gbit interfaces.

Monitoring makes me stabby.

10 Types of IT managers from hell

Ruairi
Thumb Up

Where were your articles when I was in my 20's? I've encountered every type of boss listed. Actually, sometimes, multiple types in the same person.

Excellent article, as always. Look forward to the next.

WTF is … Routing Protocol for Low-Power and Lossy Networks?

Ruairi

Why do people need to reinvent the wheel? The B.A.T.M.A.N protocol exists to solve this exact problem in IPv4, and can be easily ported to IPv6.

Nothing new here....

EU chucks €18m at research for stupidly fast networks

Ruairi

> Our Future Internet should know no barriers, least of all barriers created because we did not prepare for the data revolution.

Sorry, darlin', speed of light is a constant.

Wireless boffins boost Wi-Fi hotspot performance 700%

Ruairi

@AC:

In this case, it's a problem with how WiFi is set up, rather than TCP. WiFi is a shared medium, so you're going to get collisions (CSMA/CA attempts to give fair access to the medium). TCP is affected as it sees a collision as a drop, so it scales back throughput wise. That's why they're dropping new sessions, and giving priority to the existing data flows (It's kinda cheating, throughput wise).

TCP has quite a few throughput hacks in it (Window Scaling, SACK, binary backoff Vs exponential etc), and is quite predictable and mature. The real issue here is wireless ethernet being "non switched", thus having collisions and packet loss with many users.

Ruairi

So it's adaptive, intelligent tail dropping? I don't see what's earth shattering about this.

Wales: Let's ban Gibraltar-crazy Wikipedians for 5 years

Ruairi
FAIL

It's like a bloody soap opera over there on Wikipedia.

Cisco EOLs ACE load balancer platform

Ruairi
Stop

Re: surprised

Oh, but they did.

Arrowpoint created the CSS platform. CSM was homegrown, ACE was homegrown.

I'm not surprised about the ACE at all, it's not exactly the best platform out there.

Facebook reveals next-gen Open Compute wares

Ruairi
Flame

Hi..

We're Facebook. We are trying to get people to take us seriously with our super-duper server designs and by showing all you guys that we live in the past.

Opera extends Speed Dial with Swordfish beta

Ruairi
Angel

How long...

...before FF steal this feature?

Facebook spam prevention scam spreading like wildfire

Ruairi
Dead Vulture

Well

I suppose this is the end of Facebook as we know it.

Jobs woos devs with iPhone OS iOS 4

Ruairi

Not again

Didn't Apple make the same mistake with Cisco with the iPhone as they're making now with IOS?

Feds seize $143m worth of bogus networking gear

Ruairi
WTF?

Ummmm....

" In 2008, he attempted to traffic 100 gigabit interface converters that were bought in China"...

100Gbit? If this is not a typo, which dumb ass actually bought this and considered it legit (in 2008)?

Irish civil rights group takes aim at iPad launch

Ruairi
Joke

WTF!

What's that? Has this man any sense at all?

Cisco shareholders win say-on-pay

Ruairi
Unhappy

Yeah, right

""through regular discussions with management" and the compensation committee of the board of directors."

Haha, yeah. We all know how effective committees are.

What do Scotland, Australia and Africa have in common?

Ruairi

RE: Unified communications. Pfaw

>> The reality is that no matter how many lines of communication you offer a given employee, you can't change the fundamental fact that it is their CHOICE to respond or not.

Agree, however you have presence awareness. At least now the boss knows the employee is unable to respond.

Swedish bloke attempts lactation

Ruairi
Coat

Talk about...

... Milking his media coverage.

I'll get me coat.

Microsoft kicks Ubuntu update in the hardy herons

Ruairi

@Bruce Sinton

Nope, one machine can never live up to that level of service :)

But, say you have a farm of 10 machines behind a load balancer (in fault tolerant mode), with the correct type of probing configured on the load balancers (so that machines which still ping, but say, don't serve out HTTP, are not used for a load balancing decision), 99.999% should be an attainable goal.
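Back-of-envelope version of that claim, as a sketch. It assumes machines fail independently and that the load balancer and its probes are perfect, which is generous; the 99% per-machine figure and the function name are invented for illustration.

```python
from math import comb


def farm_availability(per_machine, machines, needed=1):
    """Probability that at least `needed` of `machines` are up,
    assuming each is up independently with probability per_machine."""
    down = 1 - per_machine
    return sum(
        comb(machines, up) * per_machine**up * down ** (machines - up)
        for up in range(needed, machines + 1)
    )


one_box = farm_availability(0.99, 1)
any_of_ten = farm_availability(0.99, 10, needed=1)
eight_of_ten = farm_availability(0.99, 10, needed=8)

print(f"single machine:      {one_box:.5f}")
print(f"any 1 of 10 serving: {any_of_ten:.12f}")
print(f"8 of 10 needed:      {eight_of_ten:.5f}")
```

The catch is capacity: if you need most of the farm up to carry the load, the number drops back below five nines. For scale, 99.999% allows about 5.26 minutes of downtime a year.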

You hardly think that MS or Apple use just one machine for these types of services?

Ruairi
Paris Hilton

@Sean Keeney

Agreed!

I'd say their setup is a little more complicated than that. I'd suspect that they have GSLB in place, to direct you to the nearest (geographically) server farm.

Anything less than 99.999% these days is just not acceptable from a corporate entity.

Paris, cause she's had more downtime than Ubuntu's site..

Levi's suffers profit meltdown in midst of SAP embrace

Ruairi
Coat

This story is..

..absolute pants.

Err, mine's the denim one