Normalising line feeds

Alastair Grant | Friday 10 June 2016

I hit a little problem where something was processing strings and normalising line feeds to Windows-style carriage-return line-feed (\r\n). This was fine, but I was trimming the string to a specific byte length before that happened, so any single-byte line feeds were then being converted to double-byte newlines and the string ended up longer than the limit.
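To make the problem concrete, here is a minimal sketch in Python (my choice of language for illustration, not necessarily what the original system used). The 8-byte limit and the sample text are made up, and the input only contains "\n", so a plain replace is safe in this one case.

```python
LIMIT = 8  # hypothetical byte limit, for illustration only

text = "ab\ncd\nefgh"

# Trim first: the result fits the limit exactly.
trimmed = text.encode("ascii")[:LIMIT]

# Something downstream then normalises the line feeds to \r\n.
normalised = trimmed.replace(b"\n", b"\r\n")

print(len(trimmed))     # 8
print(len(normalised))  # 10, each \n grew by one byte
```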

I figured the safe bet was to do the conversion beforehand and then trim, but normalising newlines is a little tricky. You start off by doing things like a standard string replace against "\r" or maybe "\n". But that spectacularly doesn't work when the original entry already has "\r\n" newlines: replace "\n" with "\r\n" and you suddenly end up with something like "\r\r\n" and it all goes to pot.
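A quick Python sketch of that failure, using a made-up sample string that mixes all three newline styles:

```python
# Sample input with Windows (\r\n), Unix (\n) and old Mac (\r) newlines.
text = "one\r\ntwo\nthree\rfour"

# Naive approach: turn every \n into \r\n.
naive = text.replace("\n", "\r\n")

print(repr(naive))  # 'one\r\r\ntwo\r\nthree\rfour'
```

The existing "\r\n" picks up a second "\r", and the lone "\r" is not touched at all.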

We all know the answer: regular expressions. But that can be a very painful world. Still, I think I've got a working pattern that I'll put here for future reference. It matches any occurrence of \r unless it's immediately followed by \n OR any occurrence of \n unless it's immediately preceded by \r. Useful for converting everything to Windows-style newlines (\r\n). It relies on lookahead and lookbehind, so it needs a regex engine that supports both.

\r(?!\n)|(?<!\r)\n
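As a rough sketch of how the pattern might be used, here it is in Python's re module, followed by the trim. The function names, the UTF-8 encoding and the 8-byte limit are my own assumptions for illustration, not part of the original setup.

```python
import re

# The pattern above: a \r not followed by \n, or a \n not preceded by \r.
LONE_NEWLINE = re.compile(r"\r(?!\n)|(?<!\r)\n")


def normalise_newlines(text: str) -> str:
    """Replace every lone \\r or \\n with \\r\\n; existing \\r\\n pairs are left alone."""
    return LONE_NEWLINE.sub("\r\n", text)


def trim_to_bytes(text: str, limit: int, encoding: str = "utf-8") -> str:
    """Cut text to at most `limit` encoded bytes (hypothetical helper).

    errors="ignore" drops a multi-byte character that the cut splits in half.
    """
    return text.encode(encoding)[:limit].decode(encoding, errors="ignore")


mixed = "one\r\ntwo\nthree\rfour"
print(repr(normalise_newlines(mixed)))  # 'one\r\ntwo\r\nthree\r\nfour'

# Normalise first, then trim, so the byte limit still holds afterwards.
result = trim_to_bytes(normalise_newlines("ab\ncd\nefgh"), 8)
print(repr(result))  # 'ab\r\ncd\r\n'
```

One thing this sketch does not handle: the cut can still land between a "\r" and its "\n", leaving a dangling "\r" at the end, so a real version may want to check for that.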