Sound Code: 2013

Tuesday 31 December 2013

2013 in Review

As this year draws to a close it’s time to reflect on what I’ve been up to in terms of development in the last year, and think about what my plans are for the coming year. As they say, if you fail to plan, you plan to fail. I certainly haven’t accomplished everything I wanted to in 2013, but looking back I did get a fair bit done. One regret is that I didn’t blog as much as usual, although there are a whole load of half-finished draft posts that I really should get round to completing.

Pluralsight

Probably the biggest thing I’ve accomplished this year has been creating two courses for Pluralsight, Digital Audio Fundamentals, and Audio Programming with NAudio. It was a huge amount of work, but I enjoyed making them, and I have a couple of future courses in the pipeline with them. They’ve been a great company to work with, and if you haven’t checked out their library of courses, it’s well worth doing so. It’s growing very rapidly and great for getting you quickly up to speed on new technologies.

Open Source

I try to get at least one release out of NAudio per year, and managed that with the October release of NAudio 1.7. I also made an appearance talking about audio on .NET Rocks. Although NAudio has been going for over 12 years now, I’ve still got plenty of ideas for what to put into future versions. The biggest challenge I face with NAudio is the time required to answer all the questions I get asked. Apologies if your question didn’t get a reply.

I also had plans to add some cool new features to my Skype Voice Changer application, which has been astonishingly successful over the years (downloads in the millions!). But Microsoft pulled the rug from under my feet with the announcement that the Skype API is being retired. My intention is for NAudio 1.8 to feature more built-in audio effects, and better examples of using them. Another plan for next year is to do a talk at my local user group showing how you can make a synthesizer and maybe some guitar effects with NAudio. If that goes well I might take the talk round some other user groups or conferences too.

The Day Job

I’ve been at my current company NICE for 9 years now, my longest stretch in a single job by some margin. And for most of that time I’ve been working on a single product. It’s been this project that has continued to drive my interest in the challenges of maintaining very large codebases, eliminating technical debt, and evolving architecture to meet new challenges. This next year I’m hoping to continue to blog more about best practices for working with legacy code.

Learning

I love learning new programming techniques and technologies, and there’s much to keep me busy this year. I’ve dipped my toe into F# and TypeScript, and I’ve built my first proper website running on Azure with ASP.NET MVC and jQuery. More recently, I’ve been learning Angular, and will hopefully find the time to blog about some of the things I’ve learned through that.

In 2014, F# and JavaScript frameworks such as Angular will probably continue to be the new technologies I focus most on. I have a few simple website ideas that I’d like to build, which should also help speed up my learning of web technologies.

Anyway, happy new year to all my readers, and I hope that 2014 is a productive and rewarding year for you.

Monday 25 November 2013

Three Surprising Causes of Technical Debt

I’ve been thinking a lot about technical debt over recent years. The codebase I work on has grown steadily to around 1 million lines of code (that’s counting whitespace and comments – there’s ‘only’ about 400K actual statements), and so the challenge of maintaining control over technical debt is one I think about on a daily basis.

What I’ve also noticed is an ever-increasing number of blogs, books, conference sessions and user group talks tackling this subject. It seems lots of us are facing the same issue. But why is it that so many of us are getting ourselves into such difficulties? What’s causing all this technical debt? I’ve got some ideas, which you might find a little surprising…

1. Computers are too Powerful

In my first programming job I wrote a C application to run on a handheld barcode scanner. My development machine was a green-screen 286 laptop, which I dubbed “the helicopter”, due to the racket its fan made. The application consisted of about 20 text-based screens, some “database” management, and some serial port communications code. If I made a change that required a full rebuild, I’d go for a walk, because it would take my development machine a couple of hours. That application had 9,423 lines of code (4,465 statements). I couldn’t use that machine to build a one million line system even if I wanted to. The build would take three years.

Compare that with the development machine I use today. It can do a clean build of my million line codebase in under 30 minutes including running the unit tests. An incremental build takes a couple of minutes. I can have the client solution (90 sub-projects) and the server solution (50 sub-projects) both running under the Visual Studio debugger, with CPU and RAM to spare. All one million lines of code are in a single source code repository, and our version control system barely breaks a sweat.

What’s more the end users’ computers have more than enough power. Our client and server can run quite happily on low-end hardware. We could probably add another million lines of code before we ran into serious memory usage issues.

But why is this a problem? Hasn’t Moore’s law been absolutely brilliant for the software development world? Well yes and no. It allows us to do things that were simply beyond the realms of possibility only a decade ago. We can create bigger and more powerful systems than ever before. And that’s exactly what we’re doing.

But imagine for a moment if Visual Studio only allowed a maximum of 10 projects in a solution. Imagine your version control system could cope with a maximum of 10000 lines of code. Imagine your compiler supported a maximum of 100 input files. What would you do?

What you’d do is modularise. You’d be forced to use vertical layering as well as just horizontal layering. You’d seriously consider a micro-service architecture. In short you wouldn’t be able to create a monolith. You’d have to create lightweight loosely coupled modules instead.

The bigger a codebase is, the more technical debt it contains. And the power of modern computers makes it far too easy to create gigantic monolithic codebases.

I’m not complaining that we need less powerful computers. We just need the self-discipline to know when to stop adding features to a single component. Because the computer isn’t going to stop us until we’ve gone way too far.

2. Frameworks are too Powerful

For me, it started with VB6, and .NET took it to a whole new level. It was called “RAD” at the time – “rapid application development”. Things that would take weeks with the previous generation of technologies could be done in days. In just a few lines of code you could express what used to take hundreds of lines.

Now in theory, this is good for technical debt. Fewer lines of code means more maintainable code right? Well yes and no.

The rise of all-encompassing frameworks like the .NET FCL, and all the myriad open source libraries that augment it, mean that our code is much denser than before. In 20 lines of code using modern languages and frameworks, we can do far more than we could achieve using 20 lines of 1990 code. This means that a modern 10KLOC codebase will have many more features than a 10KLOC codebase from 20 years ago.

So not only has the growth in computing power meant we can write more lines of code than were possible before, but the growth in framework capabilities has meant we can achieve more line for line than ever before.

How is this a problem? If a feature that would take 1000 lines of code to implement without a powerful framework can be implemented in 100 using a framework, doesn’t that make the code more maintainable? The trouble is that we can’t properly reason about code that uses a framework if we know nothing about that framework.

So if I need to debug a database issue and the code uses Entity Framework Code First, I’ll need to learn about Entity Framework to understand what is going on. And in a one million line codebase, there will be quite a lot of frameworks, libraries and technologies that I find myself needing to understand.

It’s not uncommon to find several competing frameworks co-existing within the same monolithic codebase, so there’s Entity Framework, but there’s also NHibernate, and Linq2Sql and some raw ADO.NET and maybe some Simple.Data too. There’s XML serialization, Binary serialization and JSON serialization. There’s a WCF service, a .NET remoting service, and a WebAPI service. There might be some custom frameworks that were created specifically for this codebase, but their documentation is poor or non-existent.

The bigger the codebase, the more frameworks and libraries you will need to learn in order to make sense of what’s going on. And unless you understand these frameworks, it will seem like things are happening by magic. You know the system’s doing it, but you have no idea where to put the breakpoint to debug it. And you have no idea which line of code you need to modify to fix it.

I’m certainly not complaining that we need to stop using frameworks or open source libraries. I’m not even suggesting that we need to standardise on one single framework and enforce it right through the codebase. What I am suggesting once again is that if our large systems were composed out of much smaller pieces, rather than being monoliths, the number of frameworks you’d need to learn in order to understand the code would be greatly reduced. And if you didn’t like their choice of framework, you’d have the option to rewrite that component with a better one.

3. Programming is too Easy

My final reason why we are finding it so easy to create mountains of technical debt will perhaps be the most controversial of the three. It was inspired by Uncle Bob Martin’s recent article “Hordes of Novices”. In his article he laments the fact that the software industry is flooded with novice developers writing bad code. We’d get more done, he argues, with fewer programmers writing better code. And I agree with him. He questions whether we do in fact really need more code to be written…

Or do we want less code? Less code that does more. Much less code, written much better, doing much, much more?

But why is the software industry letting novices loose on their code? Because programming is too easy. Give a junior developer a set of clear requirements, and eventually they’ll produce some code that (sort of) does what was requested of them. They might take longer than the senior developer, but they will finish.

And so from management’s perspective, adding more programmers appeared to work. We sneaked in a few extra features that the senior developers didn’t have time for. The junior developers work slower, but they’re paid less, so it all balances out in the end.

But what we know is that there is more difference between the novice and senior programmers’ code than just how long it took to write. For starters, the novice programmer will create more lines of code, often by a factor of 10 or more. So maintainability will be an issue. Also, the novice programmer will typically have more bugs in their code. For sure, the testers will have found the obvious ones. But I’m talking about time-bomb bugs, accidents waiting to happen, like storing the date-time in local time rather than UTC. When the clocks go back and the system falls over, it will be the senior developers sorting out the mess.
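
To make the local-time time bomb concrete, here is a minimal C# sketch. It assumes UK time, where the clocks went back on 27 October 2013, and the `ClockChangeDemo` class and its member names are invented for illustration. Two events happen an hour apart, but both read 01:30 on the wall clock, so a system that stored only the local time cannot tell them apart.

```csharp
using System;

static class ClockChangeDemo
{
    // Two events one hour apart on the night the UK clocks went back.
    // The first is 01:30 in summer time (UTC+1), the second is 01:30 after the change (UTC+0).
    public static readonly DateTimeOffset First =
        new DateTimeOffset(2013, 10, 27, 1, 30, 0, TimeSpan.FromHours(1));
    public static readonly DateTimeOffset Second =
        new DateTimeOffset(2013, 10, 27, 1, 30, 0, TimeSpan.Zero);

    // What you get if only the wall-clock reading was stored: the offset is lost.
    public static TimeSpan ElapsedUsingLocalTime()
    {
        return Second.DateTime - First.DateTime;
    }

    // What you get if the UTC instant was stored.
    public static TimeSpan ElapsedUsingUtc()
    {
        return Second.UtcDateTime - First.UtcDateTime;
    }
}
```

With local times the elapsed time comes out as zero, while the UTC calculation gives the true one hour. Anything that sorts, de-duplicates or bills by timestamp inherits the same ambiguity.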

And that’s not to mention areas like performance and error handling, which novices typically do a poor job at. Or simply the way they chose to implement the feature. Very often a novice will make code changes in all the wrong places, with cut and paste coding instead of isolating the change, or injecting special knowledge into parts of the system that are supposed to be generic. Their unmaintainable code sprawls right across the system, making it extremely difficult to undo the mess.

Of all my three reasons for technical debt, this was the one I least wanted to write about, because it seems mean-spirited and ego-centric. “I’m the senior developer and my code is flawless – it’s you novices that are messing things up and causing the technical debt problem”. But the blame game doesn’t get us anywhere. They may be novices, but usually they are conscientious and professional. They’re doing their best, and learning as they go.

In any case, what would the solution be? Hire no novices? Where will the next generation of master craftsmen come from? Once again I think the answer comes in the form of composing a large system out of much smaller loosely-coupled and easily replaceable components. If you must be quick to market, by all means let a junior developer make a small (1000 lines of code) module. But if it turns out to be buggy or unmaintainable, then let the senior developer delete the whole thing and code it again from scratch. The novices who progress to the next level are the ones whose components don’t need to be thrown away.

The bigger the codebase, the more code it will contain which was written in a sub-optimal way, often by novices. But in a big codebase, getting rid of bad code is extremely difficult. So if you must let novices write code, isolate their work in small replaceable modules, and don’t be afraid to throw them away at the first sign of trouble.

In fact, this applies equally to senior developers. If everyone is creating small replaceable components, technical debt can be tackled much more effectively.

Conclusion

The reason so many of us are deep in technical debt is that it’s become far too easy to rack up a huge amount of debt before you realise you’re in trouble. Our computers are so powerful they don’t warn us that we’re trying to do too much. New frameworks allow us to write succinct and powerful code, but each one adds yet one more item to the list of things you must understand in order to work on the codebase. And you can hire a dozen novice programmers and two months later add a dozen shiny new features to your codebase. But the mess that is left behind will probably never be fully cleaned up.

What’s the solution? Well I’ve said it three times already, so it won’t hurt to say it again. The more I think about technical debt, the more convinced I become that the solution is to compose large systems out of small, loosely-coupled and replaceable parts. Refactoring is one way to eliminate technical debt, but it is often too slow, and barely covers interest payments. Much better to be able to throw code into the bin, and write it again the right way. But with a monolith, that’s simply not possible.

Disagree? Tell me why in the comments…

Saturday 16 November 2013

How to Convert a Mercurial Repository to Git on Windows

There are various guides on the internet to converting a Mercurial repository into a git one, but I found that they tended to assume you had certain things installed that might not be there on a Windows PC. So here’s how I did it, with TortoiseHg installed for Mercurial, and using the version of git that comes with GitHub for Windows. (Both hg and git need to be in your path to run the commands shown here.)

Step 1 – Clone hg-git

hg-git is a Mercurial extension. This seems to be the official repository. Clone it locally:

hg clone https://bitbucket.org/durin42/hg-git

Step 2 - Add hg-git as an extension

You now need to add hg-git as a Mercurial extension. This can be done either by editing the mercurial.ini file that TortoiseHg puts in your user folder, or, to enable it for just this one repository, by editing (or creating) the hgrc file in the .hg folder and adding the following configuration (adjust the path to wherever you cloned hg-git):

[extensions]
hggit = c:/users/mark/code/temp/hg-git/hggit

Step 3 – Create a bare git repo to convert into

You now need a git repository to convert into. If you already have one created on GitHub or BitBucket, then this step is unnecessary. But if you want to push to a local git repository, then it needs to be a “bare” repo, or you’ll get this error: “abort: git remote error: refs/heads/master failed to update”. Here’s how you make a bare git repository:

git init --bare git_repo

Step 4 – Push to Git from Mercurial

Navigate to the mercurial repository you wish to convert. The hg-git documentation says you need to run the following one-time configuration:

hg bookmarks hg

And having done that, you can now push to your git repository, with the following simple command:

hg push path\to\git_repo

You should see that all your commits have been pushed to git, and you can navigate into the bare repository folder and do a git log to make sure it worked.

Step 5 – Get a non-bare git repository

If you followed step 3 and converted into a bare repository, you might want to convert to a regular git repository. Do that by cloning it again:

git clone git_repo git_regular_repo

Thursday 7 November 2013

Thoughts on using String Object Dictionary for DTOs in C#

When you have a large enterprise system, you end up with a very large number of data transfer objects / business entities that need to get persisted into databases, and serialized over various network interfaces. In C#, these DTOs will usually be represented as strongly typed classes. And it’s not uncommon to have an inheritance hierarchy of DTOs, with many different “types” of a particular entity needing to be supported.

For the most part, using strongly typed DTOs in C# just works, but as a system grows over time, making changes to these objects or introducing new ones can become very painful. Each change will result in database schema updates, and if cross-version serialization and deserialization is required (where a DTO which was serialized in one version of your application needs to be deserialized in another), it could potentially break the ability to load in legacy data.

Here’s a rather contrived example, for a system that needs to let the user configure “Storage Devices”. Several different types of storage device need to be supported, and each one has its own unique properties. Here are some classes that might be created in C# for such a system:

class StorageDevice { 
    public int Id { get; set; } 
    public string Name { get; set; }
}

class NetworkShare : StorageDevice {
    public string Path { get; set; }
    public string LoginName { get; set; }
    public string Password { get; set; }
}

class CloudStorage : StorageDevice {
    public string ServerUri { get; set; }
    public string ContainerName { get; set; }
    public int PortNumber { get; set; }
    public Guid ApiKey { get; set; }
}

These types are nice and simple, but already we run into some problems when we want to store them in a relational database. Quite often three tables will be used, one called “StorageDevices” with the ID and Name properties, and then one called “NetworkShares” linking to a storage device ID, and storing the three fields for Network Shares. Then you need to do the same for “CloudStorage”. To add a new type of storage or change an existing one in any way requires a database schema update.

Cross-version serialization is also very fragile with this approach. Your codebase can end up littered with obsolete types and properties just to avoid breaking deserialization.

This type of object hierarchy can introduce a code smell. We may well end up writing code that breaks the Liskov Substitution Principle, where we need to discover what the concrete type of a base “StorageDevice” actually is before we can do anything useful with it. This is exacerbated by the fact that developers cannot move properties from a derived type up into the base class for fear of breaking serialization.
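
As an illustration of that smell, here is a short C# sketch of the type-sniffing code such a hierarchy tends to attract. The classes are trimmed copies of the ones above, and `StorageDeviceDescriber` with its `Describe` method is a hypothetical helper, not something from a real codebase.

```csharp
using System;

class StorageDevice
{
    public int Id { get; set; }
    public string Name { get; set; }
}

class NetworkShare : StorageDevice
{
    public string Path { get; set; }
}

class CloudStorage : StorageDevice
{
    public string ServerUri { get; set; }
    public string ContainerName { get; set; }
}

static class StorageDeviceDescriber
{
    // The caller is handed a base StorageDevice, but has to find out what it
    // "really" is before it can do anything useful with it.
    public static string Describe(StorageDevice device)
    {
        if (device is NetworkShare share)
            return "Share at " + share.Path;
        if (device is CloudStorage cloud)
            return cloud.ServerUri + "/" + cloud.ContainerName;

        // Every new derived type means revisiting every method like this one.
        throw new NotSupportedException("Unknown storage device type: " + device.GetType().Name);
    }
}
```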

This approach also doesn’t lend itself well to generic extensibility. What if we wanted third parties to be able to extend our application to support new types of StorageDevice, with our code agnostic to what the concrete implementation type actually is? As it stands, those new types would need their own new database table to be stored in, and it would be very hard to write generic code that allowed configuration of those objects.

The String-Object Dictionary

A potential solution to this problem is to replace the entire inheritance hierarchy with a simple string-object dictionary:

class StorageDevice {         
    public IDictionary<string, object> Properties { get; set; }
}

The idea behind this approach is that now we never need to modify this type again. It can store any arbitrary properties, and we don’t need to create derived types. It is basically a poor man’s dynamic object, and in theory in C# you could even just use an ExpandoObject. However, having a proper type opens the door to creating extension methods that simplify getting certain key properties in and out of the dictionary. This can mitigate the biggest weakness of this approach, which is losing type safety on the properties of the object.
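
A minimal sketch of what those extension methods might look like follows. The key names (“Name”, “PortNumber”) and the method names are assumptions made for illustration, and the property initializer is added so that the example stands alone.

```csharp
using System;
using System.Collections.Generic;

class StorageDevice
{
    public IDictionary<string, object> Properties { get; set; } = new Dictionary<string, object>();
}

static class StorageDeviceExtensions
{
    // Hypothetical well-known keys, kept in one place to avoid typos at call sites.
    const string NameKey = "Name";
    const string PortNumberKey = "PortNumber";

    public static string GetName(this StorageDevice device)
    {
        return device.GetOrDefault<string>(NameKey);
    }

    public static void SetName(this StorageDevice device, string name)
    {
        device.Properties[NameKey] = name;
    }

    public static int GetPortNumber(this StorageDevice device, int fallback = 0)
    {
        return device.GetOrDefault(PortNumberKey, fallback);
    }

    // Returns the fallback if the key is missing or holds a value of the wrong type.
    public static T GetOrDefault<T>(this StorageDevice device, string key, T fallback = default(T))
    {
        if (device.Properties.TryGetValue(key, out var value) && value is T typed)
            return typed;
        return fallback;
    }
}
```

Calling code can then write `device.GetPortNumber(443)` rather than casting values out of the dictionary by hand, which restores a little of the type safety for the properties that matter most.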

These objects are more robust against version changes. You can tell that an object comes from a previous version of your system by what keys are and aren’t present in the dictionary (you could even have a version property if you wanted to be really safe), and do any conversions as appropriate. And you can successfully use objects from future versions of your system so long as the properties you need to work with are present.
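
Here is a sketch of how such a conversion could be written, under made-up assumptions: a “Version” key that older objects lack, and a version 1 “Server” property that version 2 replaces with “ServerUri”. None of these names come from a real system.

```csharp
using System.Collections.Generic;

class StorageDevice
{
    public IDictionary<string, object> Properties { get; set; } = new Dictionary<string, object>();
}

static class StorageDeviceUpgrader
{
    public const int CurrentVersion = 2;

    // Objects saved before the "Version" key existed are treated as version 1.
    public static int GetVersion(StorageDevice device)
    {
        if (device.Properties.TryGetValue("Version", out var value) && value is int version)
            return version;
        return 1;
    }

    public static void Upgrade(StorageDevice device)
    {
        if (GetVersion(device) >= CurrentVersion)
            return; // already current, or from a future version: leave it untouched

        // Hypothetical change: version 1 stored a bare host name under "Server".
        if (device.Properties.TryGetValue("Server", out var server))
        {
            device.Properties["ServerUri"] = "https://" + server;
            device.Properties.Remove("Server");
        }
        device.Properties["Version"] = CurrentVersion;
    }
}
```

Keys the upgrader does not know about are simply carried along, which is what lets an older build work with objects written by a newer one.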

Persisting these objects to a database presents something of a challenge, since you’d need to store an arbitrary object into a single field. And that object could itself be a list of objects, or an object of arbitrary complexity. Probably JSON or XML serialization would be a good approach. In many ways, these lend themselves very well to a document database, although for legacy codebases, you may be tied into a relational database. You could still run into deserialization issues if the objects stored as values in the database were subject to change. So those objects might need to be string-object dictionaries too.
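
As a rough sketch of the JSON option, the following uses System.Text.Json, which assumes a modern .NET runtime; a library such as Json.NET would be an equally reasonable choice. The `StorageDeviceJson` helper and its method names are invented for this example. JSON carries no .NET type information, so values have to be unwrapped when they are read back.

```csharp
using System.Collections.Generic;
using System.Text.Json;

class StorageDevice
{
    public IDictionary<string, object> Properties { get; set; } = new Dictionary<string, object>();
}

static class StorageDeviceJson
{
    // Produces a single string that could be stored in one database column.
    public static string ToJson(StorageDevice device)
    {
        return JsonSerializer.Serialize(device.Properties);
    }

    public static StorageDevice FromJson(string json)
    {
        var raw = JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json);
        var device = new StorageDevice();
        foreach (var pair in raw)
            device.Properties[pair.Key] = Unwrap(pair.Value);
        return device;
    }

    // Maps simple JSON values back to .NET primitives.
    static object Unwrap(JsonElement element)
    {
        switch (element.ValueKind)
        {
            case JsonValueKind.String: return element.GetString();
            case JsonValueKind.Number: return element.TryGetInt32(out var i) ? (object)i : element.GetDouble();
            case JsonValueKind.True: return true;
            case JsonValueKind.False: return false;
            case JsonValueKind.Null: return null;
            default: return element.Clone(); // nested objects and arrays stay as JsonElement
        }
    }
}
```

Richer types do not survive the round trip unaided: a Guid, for example, is written as a JSON string and comes back as a string, so the typed accessors would need to convert it.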

Another issue you might run into is deciding what to do about additional properties you want to add onto the object but not serialize. Many developers will be tempted to put extra stuff into the dictionary for convenience. One possible option would be to use namespacing on the keys. So anything with a key starting with “temp.” wouldn’t be serialized or persisted to the database, for example. I’d be interested to know if anyone else has tackled this problem.
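
The “temp.” convention could be applied with a small filter that runs just before serialization. This is only a sketch; the `TransientKeys` class and `WithoutTransient` method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TransientKeys
{
    public const string Prefix = "temp.";

    // Returns a copy of the dictionary with the transient entries left out.
    // The original is not modified, so in-memory code can keep using them.
    public static IDictionary<string, object> WithoutTransient(IDictionary<string, object> properties)
    {
        return properties
            .Where(p => !p.Key.StartsWith(Prefix, StringComparison.Ordinal))
            .ToDictionary(p => p.Key, p => p.Value);
    }
}
```

The serializer or repository would then be handed `TransientKeys.WithoutTransient(device.Properties)` instead of the raw dictionary.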

Conclusion

String object dictionaries are a powerful way of avoiding some tricky versioning issues and making your system very extensible. But in many ways they feel like trying to shoehorn a dynamic language approach into a statically typed one. I’ve noticed them cropping up in more and more places though in C# programming, such as the Katana project which uses one for its “environment dictionary”.

I think for one of the very large projects I am working on at the moment, they could be a perfect fit, allowing us to make the system significantly more flexible and extensible than it has been in the past.

But I am actively on the lookout at the moment for any articles for or against using this technique, or any significant open source projects that are taking this approach. So let me know in the comments what you think.

I did ask a question about this on Programmers stack exchange and got the rather predictable (“that’s insane”) replies, but I think there is more mileage in this approach than is immediately apparent. In particular it is the need for generic extensibility without database schema updates, and cross-version deserialization that pushes you in this direction.

Monday 4 November 2013

Announcing “Audio Programming with NAudio”

I’m really excited to announce the release of my latest Pluralsight course “Audio Programming with NAudio”. This is the follow-on course to my introductory “Digital Audio Fundamentals” course, and is intended to give a thorough and systematic run-through of how to use all the major features of NAudio.

The modules are as follows:

Introducing NAudio in which I go through all the things you can find in NAudio, and explain what the demo applications do as well. This module also introduces the three base classes/interfaces for all signal chain components in NAudio (IWaveProvider, WaveStream and ISampleProvider), which are crucial to understanding how to work with NAudio.
Audio File Playback explains how to decide which of NAudio’s playback classes you should use, and how they are configured. I cover lots of important playback related tasks such as how to change volume, reposition, and even how to stop (yes, that can require more thought than you might imagine).
Working with Files deals with the various file readers in NAudio, as well as showing how to create wave files with WaveFileWriter.
Changing Audio Formats might seem longwinded and a bit boring, but is actually one of the most important modules in the course. It explains how to change the sample rate, bit depth and channel count of PCM audio. This is something that you need to do regularly when dealing with sampled audio, and there are lots of different approaches you can take.
Working with Codecs explains how you can use the ACM and MediaFoundation codecs on your computer, and also gives a couple of other techniques for working with codecs that you might find helpful.
Recording Audio shows how to use the various NAudio recording classes, including loopback capture, and a brief demo of high performance low level capture using ASIO.
Visualizations is not really about NAudio, but gives some useful strategies for creating peak meters, waveform displays and spectrum analyzers in both WinForms and WPF. I included it because this is something lots of NAudio users wanted to know how to do.
Mixing and Effects is probably my favourite module in the course. To show how to do mixing and effects I create a software piano, and turn it into a software synthesizer.
Audio Streaming covers another topic that NAudio doesn’t directly address, but that lots of NAudio users are interested in. This module shows how you can implement playback of streaming media, and how you would go about creating a network chat application.

I really hope that everyone planning to create a serious application with NAudio will take the time to watch this course (and the previous one if necessary). You’ll find you will be a lot more productive once you know the basics of digital audio, and understand the design philosophy behind the NAudio library.

I know Pluralsight is not free, but receiving some financial compensation for the time I put into this course enabled me to spend a lot more time on it than I would have been able to otherwise. The Pluralsight library is ridiculously well stocked with courses on all kinds of development technologies and practices, so you can’t fail to get good value for the monthly subscription fee. Having said that, please be assured that I remain committed to providing good training materials on NAudio, and I plan to keep blogging and updating the main site with more documentation.

I have actually reached the point with NAudio that I am getting far more requests for help than I can keep up with. I’m averaging 10 questions/emails per day, and so if I miss even a few days I get hopelessly behind. If I’ve failed to answer your question, please accept my apologies. Part of the goal in creating this course is having something I can point people to, so if I answer your question with a link to this course, please don’t be offended.

Sunday 3 November 2013

Finding Prime Factors in F#

After having seen a few presentations on F# over the last few months, I’ve been looking for some really simple problems to get me started, and this code golf question inspired me to see if I could factorize a number in F#. I started off by doing it in C#, taking the same approach as the answers on the code golf site (which are optimised for code brevity, not performance):

var n = 502978;
for (var i = 2; i <= n; i++)
{
    while (n%i < 1)
    {
        Console.WriteLine(i);
        n /= i;
    }
}

Obviously it would be possible to try to simply port this directly to F#, but it felt wrong to do so, because there are two mutable variables in this approach (i and n). I started wondering if there was a more functional way to do this working entirely with immutable types.

The algorithm we have starts by testing if 2 is a factor of n. If it is, it divides n by 2 and tries again. Once we’ve divided out all the factors of 2, we increment the test number and repeat the process. Eventually we get down to the final factor when i equals n.

So the F# approach I came up with uses a recursive function. We pass in the number to be factorised, the potential factor to test (so we start with 2), and an (immutable) list of factors found so far, which starts off as an empty list. Whenever we find a factor, we create a new immutable list with the old factors as a tail, and call ourselves again. This means n doesn’t have to be mutable – we simply pass in n divided by the factor we just found. Here’s the F# code:

let rec f n x a = 
    if x = n then
        x::a
    elif n % x = 0 then 
        f (n/x) x (x::a)
    else
        f n (x+1) a
let factorise n = f n 2 []
let factors = factorise 502978

The main challenge to get this working was figuring out where I needed to put brackets (and remembering not to use commas between arguments). You’ll also notice I created a factorise function, saving me passing in the initial values of 2 and an empty list. This is one of F#’s real strengths, making it easy to combine functions like this.

Obviously there are performance improvements that could be made to this, and I would also like at some point to work out how to make a non-recursive version of this that still uses immutable values. But at least I now have my first working F# “program”, so I’m a little better prepared for the forthcoming F# Tutorial night with Phil Trelford at devsouthcoast.
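
One of the performance improvements hinted at above can be sketched in C#, for comparison with the loop at the top of this post. It is a variation and not a port of the F# function: it stops testing candidates once i × i exceeds what is left of n, collects the factors in a list instead of printing them, and returns an empty list for inputs below 2. The `PrimeFactors` class name is made up.

```csharp
using System.Collections.Generic;

static class PrimeFactors
{
    public static List<int> Factorise(int n)
    {
        var factors = new List<int>();

        // Any composite number has a factor no larger than its square root,
        // so once i * i exceeds the remaining n, whatever is left is prime.
        for (var i = 2; (long)i * i <= n; i++)
        {
            while (n % i == 0)
            {
                factors.Add(i);
                n /= i;
            }
        }

        if (n > 1)
            factors.Add(n); // the final, largest prime factor

        return factors;
    }
}
```

For 502978 this yields 2, 7, 37 and 971 in ascending order. The F# version builds its list by prepending, so it returns the same factors largest first.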

If you’re an F# guru, feel free to tell me in the comments how I ought to have solved this.

Tuesday 29 October 2013

NAudio 1.7 Release Notes

It’s been just over a year since the last release, so an updated release of NAudio is long overdue. There are a lot of great new features added over the last year which I’m really excited to make official. In particular, this release adds Media Foundation support, allowing you to play from a much wider variety of audio file types, including extracting audio from video files. There’s also been significant progress on Windows 8 Store app support, although it still isn’t quite ready for official release.

As usual you can get it via NuGet, or download it from Codeplex. Last year NAudio 1.6 was downloaded 29616 times from Codeplex and 7089 times on NuGet.

What’s new in NAudio 1.7?

MediaFoundationReader allows you to play any audio files that Media Foundation can play, which on Windows 7 and above means playback of AAC, MP3, WMA, as well as playing the audio from video files.
MediaFoundationEncoder allows you to easily encode audio using any Media Foundation Encoders installed on your machine. The WPF Demo application shows this in action, allowing you to encode AAC, MP3 and WMA files in Windows 8.
MediaFoundationTransform is a low-level class designed to be inherited from, allowing you to get direct access to Media Foundation Transforms if that’s what you need.
MediaFoundationResampler gives direct access to the Media Foundation Resampler MFT as an IWaveProvider, with the capability to set the quality level.
NAudio is now built against .NET 3.5. This allows us to make use of language features such as extension methods, LINQ and Action/Func parameters.
You can enumerate Media Foundation Transforms to see what’s installed. The WPF Demo Application shows how to do this.
WasapiCapture supports exclusive mode, and a new WASAPI capture demo has been added to the WPF demo application, allowing you to experiment more easily to see what capture formats your soundcard will support.
A new ToSampleProvider extension method on IWaveProvider now makes it trivially easy to convert any PCM WaveProvider to an ISampleProvider. There is also another extension method allowing an ISampleProvider to be passed directly into any IWavePlayer implementation without the need for converting back to an IWaveProvider first.
WaveFileWriter supports creating a 16 bit WAV file directly from an ISampleProvider with the new CreateWaveFile16 static method.
IWavePosition interface implemented by several IWavePlayer classes allows greater accuracy of determining exact position of playback. Contribution courtesy of ioctlLR
AIFF File Writer (courtesy of Gaiwa)
Added the ability to add a local ACM driver allowing you to use ACM codecs without installing them. Use AcmDriver.AddLocalDriver
ReadFully property allows you to create never-ending MixingSampleProvider, for use when dynamically adding and removing inputs.
WasapiOut now allows setting the playback volume directly on the MMDevice.
Support for sending MIDI Sysex messages, thanks to Marcel Schot
A new BiQuadFilter for easy creation of various filter types including high pass, low pass etc
A new EnvelopeGenerator class for creating ADSR envelopes based on a blog post from Nigel Redmon.
Lots of bugfixes (see the commit history for more details). Some highlights include…
Various code cleanups including removal of use of ApplicationException, and removal of all classes marked as obsolete.
Preview Release of WinRT support. The NAudio nuget package now includes a WinRT version of NAudio for Windows 8 store apps. This currently supports basic recording and playback. This should still very much be thought of as a preview release. There are still several parts of NAudio (in particular several of the file readers and writers) that are not accessible, and we may need to replace the MFT Resampler used by WASAPI with a fully managed one, as it might mean that Windows Store certification testing fails.
- Use WasapiOutRT for playback
- Use WasapiCaptureRT for record (thanks to Kassoul for some performance enhancement suggestions)
- There is a demo application in the NAudio source code showing record and playback

NAudio Training

Another thing I’ve wanted to do for ages is to create some really good training materials for NAudio. I began this with the release of my Digital Audio Fundamentals course on Pluralsight, which is all about giving you the background understanding you need in order to work effectively with NAudio. And I’m on the verge of releasing the follow-on course which is around 6 hours of in-depth training on how to use NAudio. Keep an eye on my Pluralsight Author page and you should see it in a couple of weeks. I’ll also announce it here on my blog. You do need to be a Pluralsight subscriber to watch these videos, but it’s only $29 for a month’s subscription, which also gives you access to their entire library, making it a great deal.

What’s next?

I’ve still got lots of plans for NAudio. Here are a few of the things near the top of my list for NAudio 1.8:

Finish Windows Store support. In particular, this means ensuring it passes Windows App Certification
I’d really like to add a fully managed resampler, which would help out in a lot of scenarios. I’ve got one ported from C already, but sadly its license (LGPL) isn’t compatible with NAudio’s.
More Media Foundation features. In particular, supporting reading from and writing to streams.
I’ll probably obsolete the DirectX Media Object components, as they serve little purpose now that we have MediaFoundation support
More managed effects implemented as ISampleProvider
I have some ideas for making a fluent API for NAudio to make it much easier to construct complex signal chains in a single line.

Hope you find NAudio useful. Don’t forget to get in touch and tell me what you’ve built with it.

Monday 30 September 2013

The Five Rules of ReSharper

I recently managed to persuade my work to get ReSharper licences for our development team. I’ve been using it since I won a free copy at Developer South Coast, and after being initially opposed to the idea of such a heavy-duty Visual Studio extension, it has won me over and I’m relying on it increasingly in my daily work.

Last week I ran some training sessions for the developers at work to introduce them to its features, but also to warn against some ways in which the tool can be misused. Here are my top five rules for using ReSharper (or any productivity tool for that matter). Let me know in the comments if you have any additional rules you’d add to the list.

1. It’s Your Code

ReSharper can offer to delete unused methods or fields, or to refactor a loop into a single LINQ statement. It provides several automatic refactorings such as renaming things, extracting methods etc. On the whole these seem to be extremely reliable, and before long you’ll be triggering these refactorings without a second thought.

But when you commit, it is your responsibility to ensure that the code works fully. Of course, the best way of doing this is having unit tests that you can run, so you can prove that when ReSharper deletes unused code, or refactors a loop, that everything still runs as expected. If you check in broken code, it is your responsibility alone - you can’t blame the tool. ReSharper sometimes gets it wrong (especially when detecting “redundant” code), but it never forces you to commit to source control. It’s your job to test that everything is still working as expected before committing your changes.

2. They’re Only Suggestions

ReSharper has a superb way of presenting the issues that it has found with the source file you are currently editing, by presenting a bar showing where in the file various errors and warnings are.

If you manage to eliminate them all, you are rewarded with a nice green tick, telling you that your code is now perfect. But this can lead to problems for people with an OCD tendency. They become obsessed by an overwhelming desire to make all of the warnings go away.

What you need to constantly bear in mind is that these are only suggestions. And sometimes the code is actually better like it is. Resist the urge to make a change that you don’t agree with. If seeing the warning sign really upsets you then R# helpfully allows you to suppress warnings by use of a comment.
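
As a rough illustration, a suppression comment looks something like the following. The class and field are made up for the example; InconsistentNaming is one of ReSharper’s inspection IDs, and the disable/restore pair limits the suppression to the lines in between.

```csharp
public class LegacyDeviceSettings
{
    // ReSharper disable InconsistentNaming
    // Hypothetical field whose name has to match an external file format,
    // so the naming suggestion is deliberately switched off for it.
    public int max_buffer_size = 4096;
    // ReSharper restore InconsistentNaming
}
```

Keeping the suppressed region as small as possible means the rest of the file still benefits from the suggestion.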

3. Keep code cleanup commits separate from regular commits

Fixing up the warnings that R# finds is so quick and easy, that you can find yourself wandering through a source file fixing all its suggestions without even thinking about it. The trouble comes when you ask someone to code review your bug-fix. They end up having to examine a huge diff, mostly nothing to do with fixing the bug. The solution here is simply to keep your bug-fix changes and your code cleanup changes separate. Or at least restrict the code cleanup to the immediate vicinity of the bug-fix.

4. Avoid code cleanup in maintenance branches

If like us you have to maintain multiple legacy versions of your software then you probably have to merge bug-fixes between branches on a regular basis. For those merges to go well, you want minimal impact to code, and code cleanup (especially renaming things) can end up causing widespread changes which can make merges painful. So the approach I have adopted for our team is to make minimal changes to legacy branches, but allow the code in the very latest version to be cleaned up as it is worked on.

5. Get it clean on first commit

Of course, the ideal is that when you write new code, you implement (or suppress) all of the R# suggestions before committing to source control. That way my rules #3 and #4 become redundant. Make sure you set up a team shared R# settings file so everyone is working to the same standards.

Saturday 27 July 2013

Announcing Digital Audio Fundamentals

I get several emails every day asking me for help with NAudio, and often it is clear that the real issue is that many developers only have a rather vague grasp of the concepts of digital audio. So I wanted to create a really good training course that would introduce and clearly explain all the things I think are really important to developers working with digital audio in general, and NAudio in particular.

So I’m really pleased to announce the availability of my Digital Audio Fundamentals course on Pluralsight. The course begins by explaining the essential topic of sampling. In particular you need to understand sample rates and bit depths if you are to be successful with NAudio. I then talk about various file formats and lots of different codecs, before looking at several types of effects. The final module may in fact be the most important for NAudio developers as I explain the concept of “signal chains” or “pipelines”. To get things done with NAudio you need to have a clear grasp of the signal chain that you are working with. NAudio tries to make it easy for you to construct a signal chain that does exactly what you need out of many simple components.

So I’d really encourage anyone who wants to use NAudio and is able to view this course to do so. It’s just under 3 hours, but it will be a very worthwhile investment of your time. I’m hoping that maybe later this year there will be a follow-up course focusing specifically on using NAudio.

I know some of you may be disappointed that this is not a free course, but I think you’ll find the Pluralsight pricing to be very reasonable. You could always sign up for just one month to watch this course, and get the added benefit of tapping into their vast library of excellent courses on a wide variety of programming topics.

Thursday 27 June 2013

Lunchtime LINQ Challenge Answers

I had a number of people attempt the Lunchtime LINQ Challenge I set yesterday, and so I thought I’d follow up with a post about each problem, and what my preferred approach was. With each problem I saw a wide variety of different approaches, and the majority were correct (apart from the odd off-by-one error). LINQ is very powerful, and chaining operators can let you achieve a lot, but you can risk creating incomprehensible code. So I was looking for answers that were succinct but also readable.

Problem 1: Numbering Players

1. Take the following string "Davis, Clyne, Fonte, Hooiveld, Shaw, Davis, Schneiderlin, Cork, Lallana, Rodriguez, Lambert" and give each player a shirt number, starting from 1, to create a string of the form: "1. Davis, 2. Clyne, 3. Fonte" etc

This one wasn’t too hard, and was designed to highlight the fact that the LINQ Select method lets you pass in a lambda that takes two parameters – the first is the item from the input IEnumerable, and the second is an index (zero based). Here was my solution to this problem:

String.Join(", ",
    "Davis, Clyne, Fonte, Hooiveld, Shaw, Davis, Schneiderlin, Cork, Lallana, Rodriguez, Lambert"
        .Split(',')
        // Trim each name so every entry comes out in the same "N. Name" form
        .Select((item, index) => (index + 1) + ". " + item.Trim())
        .ToArray())

Problem 2: Order by Age

2. Take the following string "Jason Puncheon, 26/06/1986; Jos Hooiveld, 22/04/1983; Kelvin Davis, 29/09/1976; Luke Shaw, 12/07/1995; Gaston Ramirez, 02/12/1990; Adam Lallana, 10/05/1988" and turn it into an IEnumerable of players in order of age (bonus to show the age in the output).

This basic problem isn’t too hard to solve, but calculating the age proved rather tricky. Various attempts were made such as dividing the days old by 365.25, but the only one that worked for all test cases came from this StackOverflow answer. The trouble is, inserting this code snippet into a LINQ statement would make it quite cumbersome, so my preference would be to create a small helper method or even an extension method:

public static int GetAge(this DateTime dateOfBirth)
{
    DateTime today = DateTime.Today;
    int age = today.Year - dateOfBirth.Year;
    if (dateOfBirth > today.AddYears(-age)) age--;
    return age;
}

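With that extension method in place, we can create much more readable code. Note that to sort players by age properly, the OrderBy should operate on the date of birth, rather than on the age, since two players with the same age in years can still have different birthdays. Also bear in mind that DateTime.Parse uses the current culture, so these dd/MM/yyyy dates assume a culture such as en-GB.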

"Jason Puncheon, 26/06/1986; Jos Hooiveld, 22/04/1983; Kelvin Davis, 29/09/1976; Luke Shaw, 12/07/1995; Gaston Ramirez, 02/12/1990; Adam Lallana, 10/05/1988"
.Split(';')
.Select(s => s.Split(','))
.Select(s => new { Name = s[0].Trim(), Dob = DateTime.Parse(s[1].Trim()) })
.Select(s => new { Name = s.Name, Dob = s.Dob, Age = s.Dob.GetAge() })
.OrderByDescending (s => s.Dob)
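
To see why dividing by 365.25 goes wrong, here is a small sketch comparing it with the birthday-based calculation. GetAgeOn and NaiveAgeOn are hypothetical names of my own; GetAgeOn is the same logic as GetAge above, but takes the reference date as a parameter so the result doesn’t depend on the day you run it.

```csharp
using System;

public static class AgeExamples
{
    // Same logic as GetAge, with the reference date passed in explicitly.
    public static int GetAgeOn(DateTime dateOfBirth, DateTime asOf)
    {
        int age = asOf.Year - dateOfBirth.Year;
        // Not had this year's birthday yet? Then knock a year off.
        if (dateOfBirth > asOf.AddYears(-age)) age--;
        return age;
    }

    // The approximation: days alive divided by the average year length.
    public static int NaiveAgeOn(DateTime dateOfBirth, DateTime asOf)
    {
        return (int)((asOf - dateOfBirth).TotalDays / 365.25);
    }
}
```

Someone born on 1 January 2001 is exactly one year old on 1 January 2002, but only 365 days have passed, so the division gives just under 1 and truncates to 0.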

Problem 3: Sum Timespans

3. Take the following string "4:12,2:43,3:51,4:29,3:24,3:14,4:46,3:25,4:52,3:27" which represents the durations of songs in minutes and seconds, and calculate the total duration of the whole album

The main challenges here are first turning the strings into TimeSpans, and then summing TimeSpans, since you can’t use LINQ’s Sum method on an IEnumerable<TimeSpan>. The first can be done with TimeSpan.ParseExact, or you could add an extra “0:” on the beginning to get it into the format that TimeSpan.Parse expects. Several people ignored timespans completely, and simply parsed out the minutes and seconds themselves, and added up the total number of seconds. This is OK, although not as extensible for changes to the input string format such as the introduction of millisecond components. The summing of TimeSpans can be done quite straightforwardly with the LINQ Aggregate method. Here’s what I came up with:

"4:12,2:43,3:51,4:29,3:24,3:14,4:46,3:25,4:52,3:27"
    .Split(',')
    .Select(s => TimeSpan.Parse("0:" + s))
    .Aggregate ((t1, t2) => t1 + t2)
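
For comparison, here is a sketch of the TimeSpan.ParseExact route, wrapped up in a couple of helpers. The names SumDurations and TotalDuration are my own inventions for this example, and the "m\:ss" format string is an assumption about the input always being minutes and seconds. Seeding Aggregate with TimeSpan.Zero also means an empty sequence gives zero rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class DurationExamples
{
    // Sums a sequence of TimeSpans; the seed makes an empty input return zero.
    public static TimeSpan SumDurations(this IEnumerable<TimeSpan> durations)
    {
        return durations.Aggregate(TimeSpan.Zero, (total, next) => total + next);
    }

    // Parses a comma separated list of "m:ss" durations and totals them.
    public static TimeSpan TotalDuration(string commaSeparatedDurations)
    {
        return commaSeparatedDurations
            .Split(',')
            .Select(s => TimeSpan.ParseExact(s.Trim(), @"m\:ss", CultureInfo.InvariantCulture))
            .SumDurations();
    }
}
```

For the album in the problem this comes to 38 minutes and 23 seconds.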

Problem 4: Generate Coordinates

4. Create an enumerable sequence of strings in the form "x,y" representing all the points on a 3x3 grid. e.g. output would be: 0,0 0,1 0,2 1,0 1,1 1,2 2,0 2,1 2,2

This one is nice and easy. I expected this one to be solved using a SelectMany, but I did get one answer that just used a Select:

Enumerable.Range(0,9)
    .Select(x => string.Format("{0},{1}", x / 3, x % 3))

This works fine, although it would be a little more involved to change the maximum x and y values. The answer I was expecting from most people just uses two Enumerable.Range and a SelectMany

Enumerable.Range(0,3)
.SelectMany(x => Enumerable.Range(0,3)
    .Select(y => String.Format("{0},{1}",x,y)))

But I did get a few responses that use the alternative LINQ syntax, and although I tend to prefer the chained method calls approach, in this case I think it makes for easier to read code:

from i in Enumerable.Range(0, 3)
from j in Enumerable.Range(0, 3)
select String.Format("{0},{1}", i, j)
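
If you did need to change the grid size, the query syntax version parameterises quite neatly. This is just a sketch, and GridPoints is a hypothetical helper name rather than anything from the submitted answers:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GridExamples
{
    // All "x,y" points on a grid with the given number of columns and rows.
    public static IEnumerable<string> GridPoints(int width, int height)
    {
        return from x in Enumerable.Range(0, width)
               from y in Enumerable.Range(0, height)
               select string.Format("{0},{1}", x, y);
    }
}
```

Calling GridPoints(3, 3) gives the same nine points as the answers above, and the single Select version would need both a division and a modulus updating to achieve the same thing.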

Problem 5: Swim Length Times

5. Take the following string "00:45,01:32,02:18,03:01,03:44,04:31,05:19,06:01,06:47,07:35" which represents the times (in minutes and seconds) at which a swimmer completed each of 10 lengths. Turn this into an enumerable of timespan objects containing the time taken to swim each length (e.g. first length was 45 seconds, second was 47 seconds etc)

This one was by far the hardest to implement as a single LINQ statement, since you needed to compare the current value with the previous value to calculate the difference. Perhaps the easiest trick is to Zip the sequence with itself delayed by one (here splitTimes is a variable holding the input string):

("00:00," + splitTimes).Split(',')
.Zip(splitTimes.Split(','), (s,f) => new 
    { Start = TimeSpan.Parse("0:" + s), 
      Finish = TimeSpan.Parse("0:" + f) })
.Select (q =>  q.Finish-q.Start)

The disadvantage of this approach is that you are enumerating the input sequence twice. Obviously for a string input it does not matter, but in some situations you cannot enumerate a sequence twice. I did receive two ingenious solutions that used the Aggregate function to avoid the double enumeration:

"00:45,01:32,02:18,03:01,03:44,04:31,05:19,06:01,06:47,07:35"
    .Split(new char[] { ',' })
    .Select(x => "00:" + x)
    .Aggregate((acc, z) => acc + "," + TimeSpan.Parse(z).Add(new TimeSpan(0, 0,
        -acc.Split(new char[] { ',' })
            .Sum(x => (int)TimeSpan.Parse(x).TotalSeconds))))
    .Split(new char[] { ',' })
    .Select(x => TimeSpan.Parse(x))

and

"00:45,01:32,02:18,03:01,03:44,04:31,05:19,06:01,06:47,07:35"
.Split(',')
.Aggregate("00:00:00", (lapsString, lapTime) => 
    string.Format("{0}-00:{1},00:{1}", lapsString, lapTime), result => result.Substring(0, result.Length - 9))
.Split(',')
.Select(lap => DateTime.Parse(lap.Substring(9)) - DateTime.Parse(lap.Substring(0,8)))

But to be honest, I think that this is another case for a helper extension method. Suppose we had a ZipWithSelf method that allowed you to operate on the sequence in terms of the current and previous values:

public static IEnumerable<X> ZipWithSelf<T,X>(this IEnumerable<T> inputSequence, Func<T,T,X> selector)
{
    var last = default(T);
    foreach(var input in inputSequence)
    {
        yield return selector(last, input);
        last = input;
    }
}

then we could make the LINQ statement read very naturally and only require a single enumeration of the sequence:

"00:45,01:32,02:18,03:01,03:44,04:31,05:19,06:01,06:47,07:35"
    .Split(',')    
    .Select(x => TimeSpan.Parse("00:"+x))
    .ZipWithSelf((a,b) => b - a)
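
One thing to be aware of is that ZipWithSelf passes default(T) as the “previous” value for the very first element. That is exactly what we want for the swimmer, since default(TimeSpan) is zero, but for other types it may not be. Here is a sketch of a variant that only yields genuine neighbouring pairs. The name Pairwise is my own choice for this example, not something built into LINQ:

```csharp
using System;
using System.Collections.Generic;

public static class PairwiseExamples
{
    // Calls selector(previous, current) for each adjacent pair, so a sequence
    // of n items produces n - 1 results and never sees default(T).
    public static IEnumerable<TResult> Pairwise<T, TResult>(
        this IEnumerable<T> source, Func<T, T, TResult> selector)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break; // empty input, nothing to pair up
            var previous = e.Current;
            while (e.MoveNext())
            {
                yield return selector(previous, e.Current);
                previous = e.Current;
            }
        }
    }
}
```

With this version the swim times would need a TimeSpan.Zero prepended to get the first length, but the input is still only enumerated once.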

Problem 6: Ranges

6. Take the following string "2,5,7-10,11,17-18" and turn it into an IEnumerable of integers: 2 5 7 8 9 10 11 17 18

This one represents a real problem I had to solve a while ago, and after a few iterations, found that it could be solved quite succinctly in LINQ. The trick is to turn it into an enumeration of start and end values. For the entries that aren’t ranges, just have the same value for start and end. Then you can do a SelectMany combined with Enumerable.Range:

"2,5,7-10,11,17-18"
    .Split(',')
    .Select(x => x.Split('-'))
    .Select(p => new { First = int.Parse(p[0]), Last = int.Parse(p.Last()) })
    .SelectMany(r => Enumerable.Range(r.First, r.Last - r.First + 1))

Wednesday 26 June 2013

Lunchtime LINQ Challenge

I do regular lunchtime developer training sessions at my work, and this week I created a short quiz for the developers to try out their LINQ skills. They are not too hard (although number 5 is quite tricky) and most are based on actual problems I needed to solve recently. I thought I’d post them here too in case anyone fancies a short brain-teaser. If you attempt it, why not post your answers to a Github gist and link to them in the comments. I’ll hopefully do a followup post with some thoughts on how I would attempt these. Also, feel free to suggest additional interesting LINQ problems of your own…

The Challenge…

Each of these problems can be solved using a single C# statement by making use of chained LINQ operators (although you can use more statements if you like). You'll find the String.Split function helpful to get started on each problem. Other functions you might need to use at various points are String.Join, Enumerable.Range, Zip, Aggregate, SelectMany. LINQPad would be a good choice to try out your ideas.

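1. Take the following string "Davis, Clyne, Fonte, Hooiveld, Shaw, Davis, Schneiderlin, Cork, Lallana, Rodriguez, Lambert" and give each player a shirt number, starting from 1, to create a string of the form: "1. Davis, 2. Clyne, 3. Fonte" etc

2. Take the following string "Jason Puncheon, 26/06/1986; Jos Hooiveld, 22/04/1983; Kelvin Davis, 29/09/1976; Luke Shaw, 12/07/1995; Gaston Ramirez, 02/12/1990; Adam Lallana, 10/05/1988" and turn it into an IEnumerable of players in order of age (bonus to show the age in the output).

3. Take the following string "4:12,2:43,3:51,4:29,3:24,3:14,4:46,3:25,4:52,3:27" which represents the durations of songs in minutes and seconds, and calculate the total duration of the whole album

4. Create an enumerable sequence of strings in the form "x,y" representing all the points on a 3x3 grid. e.g. output would be: 0,0 0,1 0,2 1,0 1,1 1,2 2,0 2,1 2,2

5. Take the following string "00:45,01:32,02:18,03:01,03:44,04:31,05:19,06:01,06:47,07:35" which represents the times (in minutes and seconds) at which a swimmer completed each of 10 lengths. Turn this into an enumerable of timespan objects containing the time taken to swim each length (e.g. first length was 45 seconds, second was 47 seconds etc)

6. Take the following string "2,5,7-10,11,17-18" and turn it into an IEnumerable of integers: 2 5 7 8 9 10 11 17 18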

Friday 17 May 2013

Code Reviews–What are we looking for?

A few years back I wrote about the issue of when to perform code reviews. Code reviews which happen late in the development cycle can be a waste of time, since it may be too late to act on the findings. However, there is another, much bigger, problem with code reviews, and that is the issue of what are we supposed to be looking for in the first place?

At first this might seem to be a non-issue. After all, the reviewer is simply looking for problems of any sort. But the trouble is that there are so many different types of problem a piece of code might contain that it would take a very disciplined reviewer to consciously check for each one. It would also take a very long time. The reality is that most code reviewers are only checking for a small subset of the potential issues.

So what sorts of things ought we to be looking for?

Code Standards

I put this one first as it is the one thing that almost all code reviewers seem to be good at. If you break the whitespace rules, variable naming conventions, or miss out a function or class comment, you can be sure the code reviewer will point this out.

However, these are things that a good tool such as ReSharper ought to be finding for you. Human code reviewers ought to be focusing their attention on more significant problems.

Language Paradigms

Other things an experienced developer will be quick to pick up on are when you are using the language in a sub-optimal way. In C#, this includes things like correct use of using statements and the dispose pattern, using LINQ where appropriate, knowing when to use IEnumerable and when to use an array, and making types immutable where appropriate.

Observations like this are particularly useful to help train junior developers to improve the way they write code. But again, many of these issues could be discovered by a good code analysis tool.

Maintainability

In a large system, keeping code maintainable is as critical as keeping it bug free. So a code review should look for problems such as high code complexity. I almost always point out a few places in which the code I review could be made “cleaner” by refactoring into shorter methods with descriptive names.

Uncle Bob Martin’s “SOLID” acronym is also a helpful guide. Are there bloated classes with too many responsibilities? Are we constantly making changes to the same class because it isn’t “open to extensibility”? Are we violating the “Liskov Substitution Principle” by checking if an object is an instance of a specific derived type?

Noticing maintainability problems at code review time, such as tight or hidden coupling through the use of singletons, may require some substantial re-work of the feature, but it will be much less painful to do it while it is still fresh in the developer’s mind rather than having to sort the mess out later on in the project lifetime.

Project Conventions

Every software project will have some kind of organization and structure, and in an ideal world, all developers are aware of this and make their changes in the “right” place and the right way. However, with large projects, it can be possible for developers to be unaware of the infrastructure that is in place. For example, do they ignore the error reporting mechanism in favour of re-inventing their own? Do they know about the IoC container and how to use it? Did they keep the business logic out of the view code?

For issues like this to be picked up, the code reviewer needs to be someone familiar with the original design of the code being modified. It may be necessary on some occasions for the code reviewer to ask the developer not to copy an existing convention because it has been discovered to be problematic.

Finding Bugs

Most discussions of code reviews assume that finding bugs is the main purpose, but from the list of items I’ve already mentioned it is not hard to see how we could actually forget to look for them.

This involves visually inspecting the code looking for things like potential null reference exceptions, or off by one errors. A lot of it is about checking whether the right thing happens when errors are encountered.

The trouble is that if the code is highly complex or badly structured, then it may be close to impossible to find bugs with a visual inspection. This is why the “maintainability” part of the code review is so important. Giray Özil sums this problem up brilliantly:

“Ask a programmer to review 10 lines of code, he'll find 10 issues. Ask him to do 500 lines and he'll say it looks good.”

Meeting Requirements

If the code reviewer doesn’t have a firm grasp of the actual requirements that the coder was trying to implement, it is quite possible that code that simply doesn’t meet the requirements gets through the code review. The more understanding a code reviewer has of the actual business problem this code is trying to solve, the more likely they are to spot flaws (or gaping holes) in its implementation.

User Interface

Now we move onto some areas for review that relate to specific types of code. The first is user interface code. This adds a whole host of things for a reviewer to check such as correct layout of controls, correct use of colour, all localisable text read from resource files, etc. There will likely be some established conventions in place for user interface, and for a professional looking project, you need to ensure they are adhered to.

There is also the need to review user interfaces for usability. Developers are notoriously bad at designing good UIs, so a separate review with a UX expert might be appropriate.

Threading

Multi-threaded code is notoriously hard to get right. Are there potential race conditions, dead-locks, or resources that should be protected with a lock? It is important that a code reviewer is aware of which parts of the code might be used in a multi-threaded scenario, and give special attention to those parts. Again, the code needs to be as simple as possible. A 4000 line class that could be called on multiple different threads from dozens of different places will be close to impossible to verify for thread safety.

Testing

A good code reviewer should be asking what has been done to test the code under review, including both manual and automated tests. If there are unit tests, they too ought to be code reviewed for correctness and maintainability. Sometimes the best thing a code reviewer can do is suggest additional ways in which the code ought to be tested (or made more testable – the preference should always be for automated tests where possible).

Security

One of the most commonly overlooked concerns in code reviews is security. There are the obvious things like checking the code isn’t vulnerable to a SQL injection attack, or that the passwords aren’t stored in plaintext, or that the admin-level APIs can only actually be run by someone with the correct privileges. But there are countless subtle tricks that hackers have up their sleeves, and really your code will need regular security reviews by experts.

Performance

This is another commonly overlooked concern. Of course, much of our code doesn’t need to be aggressively performance tuned, but that doesn’t mean we can get away without thinking about performance. This requires knowledge of what sort of load the system will be under in “real-world” customer environments.

Compatibility

Another big issue, particularly with large enterprise systems, is that the code may need to run on all kinds of different operating systems, or against different databases. It might need to be able to load serialized items from previous versions of the software, or cope with upgrading from the previous version of the database. If these problems are not spotted in code review, they can cause significant disruption in the field as customers find that the new version of the software doesn’t work in their environment.

Where possible, automated tests should be guarding against compatibility breakages, but it is always worth considering these issues in a code review.

Merging

If you are in the situation where you need to actively develop and maintain multiple versions of your product, then a code reviewer needs to consider the impact of these changes on future merges. As nice as it might be to completely reformat all the code, rename all the variables and reorganize all the code into new locations, you may also be making it completely impossible for any future merges to succeed.

Summary

Looking at the list above you might start to think that a code reviewer has no hope of properly covering all the bases. For this reason I think it is important that we clarify what a particular code review is intended to find. There may need to be special security, performance or UI focused reviews. I also think we should be using automated tools wherever possible to find as many of these issues as possible for us.

I’d love to hear your thoughts on this topic. Which of these areas ought a code review to focus on? Have I missed any areas?

Wednesday 15 May 2013

Knockout - What MVVM should have been like in XAML

I've not got a lot of JavaScript programming experience, but I've been learning a bit recently, and decided to try knockout.js. I was pleasantly surprised at how quickly I was able to pick it up. In part this is because it follows an MVVM approach very similar to what I am used to with WPF. I've blogged a three-part MVVM tutorial here before as well as occasionally expressing my frustrations with MVVM. So I was interested to see what the MVVM story is like for JavaScript developers. What I wasn't expecting was to find that MVVM in JavaScript using knockout.js is a much nicer experience than I was used to with XAML + C#. In this post I'll run through a few reasons why I was so impressed with knockout.

Clean Data-Binding Syntax

The first impressive thing is how straightforward the binding syntax is. Any HTML element can have a data-bind attribute attached to it, and that can hold a series of binding expressions. Here's a simple binding expression in knockout that puts text from your viewmodel into a div:

<div data-bind="text: message" ></div>

To bind multiple properties, you can just add extra binding statements into the single data-bind attribute as follows:

<button data-bind="click: prev, enable: enablePrev">

In XAML the syntax for basic bindings isn't too cumbersome, but a bit more repetitive nonetheless:

<TextBox Text="{Binding Message}" 
IsEnabled="{Binding EnableEditMessage}" />

Event Binding

One frustration with data binding in XAML is that you can't bind a function directly to an event. So for example when an animation finishes, it would be great to be able to call into a method on our ViewModel with something like this:

<Storyboard Completed="{Binding OnFinished}" />

Sadly, that is invalid XAML, but with knockout.js, binding to any event is trivial:

<div data-bind="click: handleItem"></div>

Arbitrary expressions in binding syntax

XAML data binding does allow you to drill down into members of properties on your ViewModel. For example, you can do this:

<TextBlock Text="{Binding CurrentUser.LastName}" />

And it also does give you the ability to do a bit of string formatting, albeit with a ridiculously hard to remember syntax:

<TextBlock Text="{Binding Path=OrderDate, StringFormat='{}{0:dd.MM.yyyy}'}" />

But you can't call into a method on your ViewModel, or write expressions like this

<Button IsEnabled="{Binding Items.Count > 0}" />

However, in knockout.js, we have the freedom to write arbitrary bits of JavaScript right there in our binding syntax. It's simple to understand, and it just works:

<button data-bind="click: next, enable: index() < questions.length -1">Next</button>

This is brilliant, and it keeps the ViewModel from having to do unnecessary work just to manipulate data into the exact type and format needed for the data binding expression. (Anyone who's done MVVM with XAML will be all too familiar with the task of turning booleans into Visibility values using converters.)

Property Changed Notifications

Obviously, knockout needs some way for the ViewModel to report that a property has changed, so that the view can update itself. In the world of XAML this is done via the INotifyPropertyChanged interface, which is cumbersome to implement, even if you have a ViewModelBase class that you can ask to do the raising of the event for you.

private bool selected;
public bool Selected 
{
   get { return selected; }
   set   
   {
        if (selected != value)
        {
            selected = value;
            OnPropertyChanged("Selected");
        }
   }
}

Contrast this with the gloriously straightforward knockout approach which uses a ko.observable:

this.selected = ko.observable(false);

Now selected is a function that you call with no arguments to get the current value, and with a boolean argument to set its value. It's delightfully simple. I can't help but wonder if a similar idea could be somehow shoehorned into C#.
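
As a rough sketch of what that might look like in C#, here is a hypothetical Observable<T> wrapper. The class name, its Value property and its Changed event are all my own invention for illustration. Nothing like this ships with .NET, and WPF bindings would not pick up changes from it without extra plumbing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper, loosely modelled on ko.observable.
public sealed class Observable<T>
{
    private T current;

    public Observable(T initialValue)
    {
        current = initialValue;
    }

    // Raised with the new value, but only when the value actually changes.
    public event Action<T> Changed;

    public T Value
    {
        get { return current; }
        set
        {
            if (EqualityComparer<T>.Default.Equals(current, value)) return;
            current = value;
            var handler = Changed;
            if (handler != null) handler(value);
        }
    }
}
```

A ViewModel could then declare a field such as new Observable<bool>(false) and subscribe to Changed, instead of hand-writing a backing field, a property and a notification for every value.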

To be fair to the XAML MVVM world, you can alleviate some of the pain with Simon Cropp's superb Fody extension, which allows you to add in a compile time code weaver to automatically raise PropertyChanged events on your ViewModels. I use this on just about all my WPF projects now. It is a great timesaver and leaves your code looking a lot cleaner to boot. However, in my opinion, if you have to use code weaving, it's usually a sign that your language is lacking expressivity. I'd rather directly express myself in code.

Computed properties

Computed properties can be a pain in C# as well, as you have to remember to explicitly raise the PropertyChanged notification. (Although Fody is very powerful in this regard and can spot that a property getter on your ViewModel depends on other property getters.) Here's an example of a calculated FullName property in a C# ViewModel:

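private string firstName;
public string FirstName 
{
   get { return firstName; }
   set   
   {
        if (firstName != value)
        {
            firstName = value;
            OnPropertyChanged("FirstName");
            // FullName depends on FirstName, so it must be raised here too
            OnPropertyChanged("FullName");
        }
   }
}

// The Surname property (not shown) needs the same pair of notifications
public string FullName 
{
    get { return FirstName + " " + Surname; }
}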

Knockout's solution to this is once again elegant and simple. You simply declare a ko.computed on your ViewModel:

this.fullName = ko.computed(function() {
     return this.firstName() + " " + this.lastName();
}, this);

Elegant handling of binding to parent and root data contexts

Another area that causes regular pain for me with XAML databinding is when you need to bind to your parent's context or the root context. I think I've just about got the syntax memorized now, but I must have searched for it on StackOverflow hundreds of times before it finally stuck. You end up writing monstrosities like this:

...Binding="{Binding RelativeSource={RelativeSource FindAncestor, 
AncestorType={x:Type Window}}, Path=DataContext.AllowItemCommand}" ...

In knockout, once again, the solution is simple and elegant, allowing you to use $parent to access your parent data context (and grandparents with $parents[1] etc), or $root. Read more about knockout's binding context here.

<div data-bind="foreach: currentQuestion().answers">
    <div data-bind="html: answer, click: $parent.currentQuestion().select"></div>
</div>

Custom binding extensions!

Finally, the killer feature. If only we could add custom binding expressions to XAML, then maybe we could work around some of its limitations and upgrade it with powerful capabilities. Whilst I have a feeling that this is in fact technically possible, I don't know of anyone who has actually done it. Once again knockout completely knocks XAML out of the water on this front, with a very straightforward extensibility model for you to create your own powerful extensions. If you run through the knockout tutorial, you'll implement one yourself and be amazed at how easy it is.

Conclusion

I've only used knockout.js for a few hours (here's what I made) and all I can say is I am very jealous of its powers. This is what data binding in XAML ought to have been like. XAML has been around for some time now, but there have been very few innovations in the data-binding space (we have seen the arrival of behaviours, visual state managers, merged dictionaries, format strings, but all of them suffer from the same clunky, hard to remember syntax). And now that we have another subtly different flavour of XAML to learn for Windows 8 Store apps, it seems that XAML is here to stay. Maybe it is time for us to put some pressure onto Microsoft to give XAML data binding power to rival what the JavaScript community seem to be able to take for granted.

Thursday 28 March 2013

Thoughts on the demise of Google Reader (and Blogging)

I started blogging in 2004. The ability to add new articles to my website without the laborious task of modifying HTML files and FTPing them up to my webspace was nothing short of magical. Even better was the community that suddenly sprang up around shared interests. If you read something interesting on someone else’s blog, you could comment, but even better, you could write your own post in response, linking to theirs, and a “trackback” would notify them, and a link to your response would appear at the bottom of their post. It was a great way of finding like-minded people.

But spammers quickly put an end to all that, and it wasn’t long before trackbacks were turned off for most blogs. Your only interaction came in the comments, and even that was less than ideal as so few blogs supported notification for comments. Blog posts were no longer conversation starters, they were more like magazine articles.

The next major setback for blogging was Twitter. With an even quicker way to get your thoughts out to the world, many bloggers (myself included to be honest) started to neglect their blogs. In one sense, this is no big deal. I’d rather follow many blogs that have infrequent but interesting posts than a few that have loads of posts of low quality. Which is why I love RSS and Google Reader. Some of the people I follow only blog a few times a year. But when they do write something, I know immediately, and can interact with them in the comments.

Now Google Reader is going away and this could be the killer blow for many small, rarely updated blogs. Of my 394 subscribers (according to feedburner), 334 are using Google Reader. I wonder how many I’ll have left after July 1st? Sure, there are a good number of us who are researching alternative products, but there are also many non-technical users of Google Reader. Take my wife for example. She likes and uses Google Reader, but doesn’t really want the hassle of switching and could easily miss the deadline (unless I transfer her subscriptions for her). Her response when I told her Google Reader was being shut down: “Why don’t they shut down Google Plus instead? No one uses that.” My thoughts exactly.

Now it is true that I get a lot more traffic from search engines than I do from subscribers. Google regularly sends people here to look at posts I made five years ago about styling a listbox in Silverlight 2 beta 2. But for me, part of the joy of blogging is interacting with other bloggers and readers about topics you are currently interested in. Without subscribers, you need to not only blog, but announce your every post on all the social networking sites you can access, quite possibly putting off potential readers with your constant self-promoting antics. If you choose not to do this, then you could easily find that no one at all is reading your thoughts. And then you start to question whether there is any point at all in blogging.

Google Reader Alternatives

So what are the options now that Google Reader is going? It does seem that there are a few viable replacement products – feedly and newsblur are rising to the challenge, offering quick ways to import your feeds. Apparently Digg are going to try to build an alternative before the cut-off date. But one thing all these options have in common is that they are scrambling to scale and add features fast enough to meet a very challenging deadline. There is no telling which of them will succeed, or come up with a viable long-term business model (how exactly do the free options plan to finance their service?), or offer the options we need to migrate again if we decide we have made the wrong choice. I could easily still be searching for an alternative long after Google Reader is gone. And then there is the integration with mobile readers. I use Wonder Reader on the Windows Phone. I have no idea whether it will continue to be usable in conjunction with any of these new services.

Or I could think outside the box. Could I write my own RSS reader and back-end service in the cloud just for me? Possibly, and I can’t say I haven’t been tempted to try, but I have better things to do with my time. Or how about (as some have apparently already done) giving up altogether on RSS, and just getting links from Twitter, or Digg, or from those helpful people who write daily link digests (the Morning Brew is a favourite of mine)? I could, and perhaps I would find some new and cool stuff I wouldn’t have seen in Google Reader. But there is nothing as customised to me as my own hand-selected list of blog subscriptions. There aren’t many people who share my exact mix of interests (programming, theology, football, home studio recording), and it would be a great shame to lose touch with the rarely updated blogs of my friends. And that’s to say nothing of the other uses of RSS such as being notified of new issues raised on my open source projects at CodePlex, or following podcasts. In short, I’m not ready to walk away from RSS subscriptions yet. At least, not until there is something actually better.

What’s Next To Go?

The imminent closure of Google Reader leaves me concerned about two other key components of my blogging experience which are also owned by Google – Feedburner and Blogger (Blogspot). I chose blogger to host this blog as I felt sure that Google would invest heavily in making it the best blogging platform on the internet. They haven’t. I have another blog on WordPress and it is far superior. I’ve been researching various exit strategies for some time (including static blogging options like octopress) but as with RSS feed readers, migrating to an alternative blog provider is not a choice to be taken lightly. Even more concerning is that feedburner was part of my exit strategy – I can use it to make the feed point to any other site easily. If Google ditch that, I’ll lose all my subscribers regardless of what reader they are using. It is rather concerning that Google have the power to deal a near-fatal blow to the entire blogging ecosystem should they wish.

What I’d Like To See

Congratulations if you managed to get this far, and apologies for the gloomy nature of this post so far. So why don’t I end it with a blogging challenge? What I’d like to see is posts from gurus in all things cloud, database, nosql, html5, nodejs, javascript, etc on how they would go about architecting a Google Reader replacement. What existing open source components are ready made building blocks? Would you build it on Azure table storage, or perhaps with RavenDB? How would you efficiently track items read, starred and tagged? What technologies would you use to make a reader interface that works for desktop and mobile? I’m not much of a web developer myself, so I’d love to see some cool open source efforts in this area, even if they are of the self-host variety.

Wednesday 27 March 2013

How to convert byte[] to short[] or float[] arrays in C#

One of the challenges that frequently arises when writing audio code in C# is that you get a byte array containing raw audio that would be better presented as a short (Int16) array, or a float (Single) array. (There are other formats too – some audio is 32 bit int, some is 64 bit floating point, and then there is the ever-annoying 24 bit audio). In C/C++ the solution is simple: cast the address of the byte array to a short * or a float * and access each sample directly.

Unfortunately, in .NET casting byte arrays into another type is not allowed:

byte[] buffer = new byte[1000];
short[] samples = (short[])buffer; // compile error!

This means that, for example, in NAudio, when the WaveIn class returns a byte[] in its DataAvailable event, you usually need to convert it manually into 16 bit samples (assuming you are recording 16 bit audio). There are several ways of doing this. I’ll run through five approaches, and finish up with some performance measurements.

BitConverter.ToInt16

Perhaps the simplest conceptually is to use the System.BitConverter class. This allows you to convert a pair of bytes at any position in a byte array into an Int16. To do this you call BitConverter.ToInt16. Here’s how you read through each sample in a 16 bit buffer:

byte[] buffer = ...;
for(int n = 0; n < buffer.Length; n+=2)
{
   short sample = BitConverter.ToInt16(buffer, n);
}

For byte arrays containing IEEE float audio, the principle is similar, except you call BitConverter.ToSingle. 24 bit audio can be dealt with by copying three bytes into a temporary four byte array and using ToInt32.
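
The 24 bit case has one subtlety, which is getting the sign right. Here's a minimal sketch of one way to do it, assuming the samples are stored little-endian (as in WAV files) and that the code runs on a little-endian machine (BitConverter.IsLittleEndian is true). The Sample24 class and its Read method are names I've made up for illustration.

```csharp
using System;

// Hypothetical helper for reading 24 bit samples with BitConverter.
public static class Sample24
{
    // Reads one little-endian 24 bit sample starting at offset.
    // temp is a reusable 4 byte scratch array, so nothing is allocated per sample.
    public static int Read(byte[] buffer, int offset, byte[] temp)
    {
        temp[0] = 0;
        temp[1] = buffer[offset];
        temp[2] = buffer[offset + 1];
        temp[3] = buffer[offset + 2];
        // The sample now sits in the top three bytes of the Int32.
        // Shifting right by 8 moves it back down and keeps the sign.
        return BitConverter.ToInt32(temp, 0) >> 8;
    }
}
```

Copying the three bytes into the low end of the temporary array instead would also work for positive samples, but negative ones would come out as large positive numbers unless you sign-extend them yourself.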

BitConverter also includes a GetBytes method to do the reverse conversion, but you must then manually copy the return of that method into your buffer.

Bit Manipulation

Those who are more comfortable with bit manipulation may prefer to use bit shift and OR operations to convert each pair of bytes into a sample:

byte[] buffer = ...;
for (int n = 0; n < buffer.Length; n+=2)
{
   short sample = (short)(buffer[n] | buffer[n+1] << 8);
}

This technique can be extended for 32 bit integers, and is very useful for 24 bit, where none of the other techniques work very well. However, for IEEE float, it is a bit more tricky, and one of the other techniques should be preferred.

For the reverse conversion, you need to write more bit manipulation code.
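
As a sketch of what that extra code might look like, here is a 16 bit write and a 24 bit read done purely with shifts and ORs. It assumes little-endian sample data, and the BitSamples class and its method names are hypothetical.

```csharp
using System;

// Hypothetical helpers using only bit manipulation.
public static class BitSamples
{
    // Writes a 16 bit sample into the buffer, low byte first.
    public static void WriteInt16(byte[] buffer, int offset, short sample)
    {
        unchecked
        {
            buffer[offset] = (byte)sample;            // low byte
            buffer[offset + 1] = (byte)(sample >> 8); // high byte
        }
    }

    // Reads a 24 bit sample by assembling it in the top three bytes of an int,
    // then shifting down by 8 so that the sign bit is preserved.
    public static int ReadInt24(byte[] buffer, int offset)
    {
        return (buffer[offset] << 8
              | buffer[offset + 1] << 16
              | buffer[offset + 2] << 24) >> 8;
    }
}
```

The unchecked block simply makes the truncating casts safe even if the project is compiled with overflow checking turned on.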

Buffer.BlockCopy

Another option is to copy the whole buffer into an array of the correct type. Buffer.BlockCopy can be used for this purpose:

byte[] buffer = ...;
short[] samples = new short[buffer.Length / 2]; // two bytes per 16 bit sample
Buffer.BlockCopy(buffer,0,samples,0,buffer.Length);

Now the samples array contains the samples in easy to access form. If you are using this technique, try to reuse the samples buffer to avoid making extra work for the garbage collector.

For the reverse conversion, you can do another Buffer.BlockCopy.
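
Here's a small sketch of both directions with caller-supplied (and therefore reusable) arrays. The thing to remember is that Buffer.BlockCopy counts in bytes, not array elements. The BlockCopySamples class and its method names are just for illustration.

```csharp
using System;

// Hypothetical helpers wrapping Buffer.BlockCopy for 16 bit audio.
public static class BlockCopySamples
{
    // Copies byteCount bytes of raw audio into a short array.
    // destination needs room for at least byteCount / 2 samples.
    public static void BytesToSamples(byte[] source, short[] destination, int byteCount)
    {
        Buffer.BlockCopy(source, 0, destination, 0, byteCount);
    }

    // Copies sampleCount samples back into a byte array.
    // The count passed to BlockCopy is in bytes, hence the multiplication.
    public static void SamplesToBytes(short[] source, byte[] destination, int sampleCount)
    {
        Buffer.BlockCopy(source, 0, destination, 0, sampleCount * sizeof(short));
    }
}
```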

WaveBuffer

One cool trick NAudio has up its sleeve (thanks to Alexandre Mutel) is the “WaveBuffer” class. This uses the [StructLayout(LayoutKind.Explicit)] attribute to effectively create a union of a byte[], a short[], an int[] and a float[]. This allows you to “trick” C# into letting you access a byte array as though it was a short array. You can read more about how this works here. If you’re worried about its stability, NAudio has been successfully using it with no issues for many years. (The only gotcha is that you probably shouldn’t pass it into anything that uses reflection, as underneath, .NET knows that it is still a byte[], even if it has been passed as a float[]. So for example don’t use it with Array.Copy or Array.Clear). WaveBuffer can allocate its own backing memory, or bind to an existing byte array, as shown here:

byte[] buffer = ...;
var waveBuffer = new WaveBuffer(buffer);
// now you can access the samples using waveBuffer.ShortBuffer, e.g.:
var sample = waveBuffer.ShortBuffer[sampleIndex];

This technique works just fine with IEEE float, accessed through the FloatBuffer property. It doesn’t help with 24 bit audio though.

One big advantage is that no reverse conversion is needed. Just write into the ShortBuffer, and the modified samples are already in the byte[].

Unsafe Code

Finally, there is a way in C# that you can work with pointers as though you were using C++. This requires that you set your project assembly to allow “unsafe” code. “Unsafe” means that you could corrupt memory if you are not careful, but so long as you stay in bounds, there is nothing unsafe at all about this technique. Unsafe code must be in an unsafe context – so you can use an unsafe block, or mark your method as unsafe.

byte[] buffer = ...;
unsafe 
{
    fixed (byte* pBuffer = buffer)
    {
        short* pSample = (short*)pBuffer;
        // now we can access samples via pSample e.g.:
        var sample = pSample[sampleIndex];
    }
}

This technique can easily be used for IEEE float as well. It also can be used with 24 bit if you use int pointers and then bit manipulation to blank out the fourth byte.

As with WaveBuffer, there is no need for reverse conversion. You can use the pointer to write sample values directly into the memory for your byte array.

Performance

So which of these methods performs the best? I had my suspicions, but as always, the best way to optimize code is to measure it. I set up a simple test application which went through a four minute MP3 file, converting it to WAV and finding the peak sample values over periods of a few hundred milliseconds at a time. This is the type of code you would use for waveform drawing. I measured how long each one took to go through a whole file (I excluded the time taken to read and decode MP3). I was careful to write code that avoided creating work for the garbage collector.

Each technique was quite consistent in its timings:

Technique	Debug Build	Release Build
BitConverter	263,265,264	166,167,167
Bit Manipulation	254,243,250	104,104,103
Buffer.BlockCopy	205,206,204	104,103,103
WaveBuffer	239,264,263	97,97,97
Unsafe	173,172,162	98,98,98

As can be seen, BitConverter is the slowest approach, and should probably be avoided. Buffer.BlockCopy was the biggest surprise for me - the additional copy was so quick that it paid for itself very quickly. WaveBuffer was surprisingly slow in debug build – but very good in Release build. It is especially impressive given that it doesn’t need to pin its buffers like the unsafe code does, so it may well be the quickest possible technique in the long run as it doesn’t hinder the garbage collector from compacting memory. As expected the unsafe code gave very fast performance. The other takeaway is that you really should be using Release build if you are doing audio processing.

Anyone know an even faster way? Let me know in the comments.

Monday 18 March 2013

Why Static Variables Are Dangerous

In an application I work on, we need to parse some custom data files (let’s call them XYZ files). There are two versions of the XYZ file format, which have slightly different layouts of the data. You need to know what version you are dealing with to know what sizes the various data structures will be.

We inherited some code which could read XYZ files, and it contained the following snippet. While it was reading the XYZ file header it stored the file version into a static variable, so that later on in the parsing process it could use that to make decisions.

public static XyzVersion XyzVersion { get; set; }

public static int MaxSizeToUse
{
    get
    {
        switch (XyzVersion)
        {
            case XyzVersion.First:
                return 8;
            case XyzVersion.Second:
                return 16;
        }

        throw new InvalidOperationException("Unknown XyzVersion");
    }
}

public static int DataSizeToSkip
{
    get
    {
        switch (XyzVersion)
        {
            case XyzVersion.First:
                return 8;
            case XyzVersion.Second:
                return 0;
        }

        throw new InvalidOperationException("Unknown XyzVersion");
    }
}

Can you guess what went wrong? For years this code worked perfectly on hundreds of customer sites worldwide. All XYZ files, of both versions, were being parsed correctly. But then, we suddenly started getting customers reporting strange problems to do with their XYZ files. When we investigated it, we discovered that we now had customers whose setup meant they could be dealing with two different versions of the XYZ file. That on its own wasn’t necessarily a problem. The bug occurred when our software, on two different threads simultaneously, was trying to parse XYZ files of different versions.

So one thread started to parse a version 1 XYZ file, and set the static variable to 1. Then the other thread started to parse a version 2 XYZ file and set the static variable to 2. When the first thread carried on, it now incorrectly thought it was dealing with a version 2 XYZ file, and data corruption ensued.

What is the moral of this story? Don’t use a static variable to hold state information that isn’t guaranteed to be absolutely global. This is also a reason why the singleton pattern is so dangerous. The assumption that “there can only ever be one of these” is very often proved wrong further down the road. Here the assumption was that we would only ever see one version of the XYZ files on a customer site. That was true for several years … until it wasn’t anymore.

In this case, the right approach was for each XYZ file reader class to keep track of what version it was dealing with, and pass that through to the bits of code that needed to know it (it wasn’t even a difficult change to make). Static variables get used far too often simply because they are convenient and “save time”. But any time saved coding will be lost further down the road when your “there can only be one” assumption proves false.
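
To give a flavour of the change, here is a simplified sketch rather than our actual code. The XyzFileReader class name and the enum definition are assumptions for illustration; the sizes are the ones from the snippet above. The version becomes instance state, so two readers on two threads can no longer trample on each other.

```csharp
using System;

// Assumed definition; the original code only shows the First and Second members.
public enum XyzVersion { First = 1, Second = 2 }

// Hypothetical reader that owns its version instead of sharing a static.
public sealed class XyzFileReader
{
    private readonly XyzVersion version;

    // In real code the version would come from parsing the file header.
    public XyzFileReader(XyzVersion version)
    {
        this.version = version;
    }

    public int MaxSizeToUse
    {
        get
        {
            switch (version)
            {
                case XyzVersion.First: return 8;
                case XyzVersion.Second: return 16;
            }
            throw new InvalidOperationException("Unknown XyzVersion");
        }
    }

    public int DataSizeToSkip
    {
        get
        {
            switch (version)
            {
                case XyzVersion.First: return 8;
                case XyzVersion.Second: return 0;
            }
            throw new InvalidOperationException("Unknown XyzVersion");
        }
    }
}
```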

Friday 15 March 2013

NAudio on .NET Rocks

I was recently interviewed by Carl Franklin and Richard Campbell for their .NET Rocks podcast and the episode was published yesterday. You can have a listen here. I was invited onto the show after helping Carl out with an interesting ASIO related problem. I essentially built a mixer for his 48 in and 48 out MOTU soundcard. It is by far the most data that anyone has ever tried to push through NAudio (to my knowledge) and it did struggle a bit – he had to reduce the channel count to avoid corruption, but it was still impressive what was achieved at a low latency. However, I’m hoping to do some performance optimisations, and it would be very interesting to see if we can get 48 in and 48 out working smoothly at 44.1kHz in a managed environment. I’ll hopefully blog about it once I’ve got something working.

Friday 8 March 2013

Essential Developer Principles #4 – Open Closed Principle

The “Open Closed Principle” is usually summarised as code should be “open for extension” but “closed for modification”. The way I often express it is that when I am adding a new feature to an application, I want as far as possible to be writing new code rather than changing existing code.

However, I noticed there has been some pushback on this concept from none other than the legendary Jon Skeet. His objection seems to be based on the understanding that OCP dictates that you should never change existing code. And I agree; that would be ridiculous. It would encourage an approach to writing code where you added extensibility points at every conceivable juncture – all methods virtual, events firing before and after everything, XML configuration allowing any class to be swapped out, etc, etc. Clearly this would lead to code so flexible that no one could work out what it was supposed to do. It would also violate another well-established principle – YAGNI (You ain’t gonna need it). I (usually) don’t know in advance in what way I’ll need to extend my system, so why complicate matters by adding numerous extensibility points that will never be used (or more likely, still need to be modified before they can be used)?

So in a nutshell, here’s my take on OCP. When I’m writing the initial version of my code, I simply focus on writing clean maintainable code, and don’t add extensibility points unless I know for sure they are needed for an upcoming feature. (so yes, I write code that doesn’t yet adhere to OCP).

But when I add a new feature that requires a modification to that original code, instead of just sticking all the new code in there alongside the old, I refactor the original class to add the extensibility points I need. Then the new feature can be added in an isolated way, without adding additional responsibilities to the original class. The benefit of this approach is that you get extensibility points that are actually useful (because they are being used), and they are more likely to enable further new features in the future.

OCP encourages you to make your classes extensible, but doesn’t stipulate how you do so. Here are some of the most common techniques:

Pass dependencies as interfaces into your class allowing callers to provide their own implementations
Add events to your class to allow people to hook in and insert steps into the process
Make your class suitable as a base class with appropriate virtual methods and protected fields
Create a “plug-in” architecture which can discover plugins using reflection or configuration files
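
To make the first two techniques a little more concrete, here is a deliberately tiny sketch. All of the names (IMessageFormatter, UpperCaseFormatter, Notifier, Sending) are made up for illustration. The Notifier class never needs to change when a new formatter or a new pre-send step comes along.

```csharp
using System;

// Extensibility point 1: a dependency passed in as an interface.
public interface IMessageFormatter
{
    string Format(string message);
}

// One possible implementation; callers can supply their own.
public sealed class UpperCaseFormatter : IMessageFormatter
{
    public string Format(string message)
    {
        return message.ToUpperInvariant();
    }
}

public sealed class Notifier
{
    private readonly IMessageFormatter formatter;

    public Notifier(IMessageFormatter formatter)
    {
        this.formatter = formatter;
    }

    // Extensibility point 2: an event that lets callers hook into the process.
    public event Action<string> Sending;

    // Formats the message, tells any subscribers, and returns what was sent.
    public string Send(string message)
    {
        string formatted = formatter.Format(message);
        var handler = Sending;
        if (handler != null) handler(formatted);
        return formatted;
    }
}
```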

It is clear that OCP is in fact very closely related to SRP (Single Responsibility Principle). Violations of OCP result in violations of SRP. If you can’t extend the class from outside, you will end up sticking more and more code inside the class, resulting in an ever-growing list of responsibilities.

In summary, for me OCP shouldn’t mean you’re not allowed to change any code after writing it. Rather, it’s about how you change it when a new feature comes along. First, refactor to make it extensible, then extend it. Or to put it another way that I’ve said before on this blog, “the only real reasons to change the existing code are to fix bugs, and to make it more extensible”.