    It has been a while since my last post, thanks to vacation and plain laziness, but now I’m back with a fresh one. This post concerns something that almost all programmers do on a daily basis: string concatenation.

    If you’re making a lot of string concatenations, you might experience performance problems. The problem is that strings in .NET are immutable. Every concatenation therefore allocates a new string object, copies the contents of both operands into it, and leaves the old string behind for the garbage collector. This allocation and copying adds overhead and can hurt the performance of the program.
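    A minimal sketch of what immutability means in practice. The module and variable names are made up for illustration, and it is written as a console program rather than a web page to keep it short:

    ```vbnet
    Imports System

    Module ImmutabilityDemo
        Sub Main()
            Dim current As String = "abc"
            Dim before As String = current   ' second reference to the same string object

            current &= "def"                 ' allocates a new string; "abc" itself is not modified

            Console.WriteLine(before)                                   ' abc
            Console.WriteLine(current)                                  ' abcdef
            Console.WriteLine(Object.ReferenceEquals(before, current))  ' False
        End Sub
    End Module
    ```

    The variable that was concatenated onto now points at a different object, while the original string is still intact. That extra allocation and copy is the price paid on every concatenation.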

    As most programmers know, it can be a good idea to use the StringBuilder class when you’re concatenating many times. The rule of thumb is that when the number of concatenations is very low, the overhead of instantiating the StringBuilder object outweighs the speed gained by using it. But how big is that overhead? And how many concatenations does it take for the StringBuilder to outperform normal concatenation?

    The StringBuilder keeps the appended text in an internal buffer and only produces the final string when the ToString() method is called. But what if you put the strings in a string array yourself and call the Join() method once all the strings have been added? Will the plain string array outperform the StringBuilder?
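    As a rough illustration of the two alternatives, here is a small sketch. The module name and sample strings are invented for the example and are not part of the test code:

    ```vbnet
    Imports System
    Imports System.Text

    Module BuildStringDemo
        Sub Main()
            Dim parts() As String = {"alpha", "beta", "gamma"}

            ' Approach 1: append each piece to a StringBuilder, then build the string once.
            Dim builder As New StringBuilder()
            For Each part As String In parts
                builder.Append(part)
            Next
            Dim fromBuilder As String = builder.ToString()

            ' Approach 2: keep the pieces in an array and join them in a single call.
            Dim fromJoin As String = String.Join("", parts)

            Console.WriteLine(fromBuilder = fromJoin)   ' True
        End Sub
    End Module
    ```

    Both approaches produce the same string and both postpone creating it until all the pieces are known, which is what separates them from repeated normal concatenation.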

    To shed some light on this matter I designed some tests. I decided to test how long it took to make X concatenations, and whether the length of the text string made a difference. This is the code I used:

    Imports System.Diagnostics   ' Stopwatch
    Imports System.Text          ' StringBuilder

    Partial Public Class _Default
        Inherits System.Web.UI.Page

        ' Runs every combination of string length and concatenation count
        ' and writes the elapsed milliseconds for each method to the page.
        Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
            Dim mStr() As String = {"a", "aaaaa", "aaaaaaaaaa", "aaaaaaaaaaaaaaaaaaaa"}
            Dim mNumberOfConcats() As Integer = {10000, 25000, 50000, 100000}

            For Each mStrElement As String In mStr
                For Each mConcatNumber As Integer In mNumberOfConcats
                    Response.Write("Normal concatenating '" & mStrElement & "' " & mConcatNumber & " times:<br />Operation took: " & _
                                        ConcatNormal(mStrElement, mConcatNumber) & "<br /><br />")
                    Response.Write("StrBuilder concatenating '" & mStrElement & "' " & mConcatNumber & " times:<br />Operation took: " & _
                                        ConcatStringBuilder(mStrElement, mConcatNumber) & "<br /><br />")
                    Response.Write("Array concatenating '" & mStrElement & "' " & mConcatNumber & " times:<br />Operation took: " & _
                                        ArrayConcat(mStrElement, mConcatNumber) & "<br />_______________________________________________<br />")
                    Response.Flush()
                Next
            Next
        End Sub

        ' Times filling a string array and joining it with String.Join.
        Private Function ArrayConcat(ByVal pStr As String, ByVal pNumberOfConcats As Integer) As Long
            Dim mStopWatch As New Stopwatch()
            GC.Collect()
            mStopWatch.Start()
            ' VB array declarations take the upper bound, so subtract 1
            ' to get exactly pNumberOfConcats elements.
            Dim mStrArray(pNumberOfConcats - 1) As String
            For i As Integer = 1 To pNumberOfConcats
                mStrArray(i - 1) = pStr
            Next
            Dim mFoo As String = [String].Join("", mStrArray)
            mStopWatch.Stop()
            Return mStopWatch.ElapsedMilliseconds
        End Function

        ' Times repeated normal concatenation onto a single string.
        Private Function ConcatNormal(ByVal pStr As String, ByVal pNumberOfConcats As Integer) As Long
            Dim mStopWatch As New Stopwatch()
            Dim mStr As String = ""
            GC.Collect()
            mStopWatch.Start()
            For i As Integer = 1 To pNumberOfConcats
                mStr += pStr
            Next
            mStopWatch.Stop()
            Return mStopWatch.ElapsedMilliseconds
        End Function

        ' Times appending to a StringBuilder, including its instantiation and the final ToString().
        Private Function ConcatStringBuilder(ByVal pStr As String, ByVal pNumberOfConcats As Integer) As Long
            Dim mStopWatch As New Stopwatch()
            GC.Collect()
            mStopWatch.Start()
            Dim mStrBuilder As New StringBuilder()
            For i As Integer = 1 To pNumberOfConcats
                mStrBuilder.Append(pStr)
            Next
            Dim mStr As String = mStrBuilder.ToString()
            mStopWatch.Stop()
            Return mStopWatch.ElapsedMilliseconds
        End Function
    End Class

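    | Method / Number of concatenations | 10000 | 25000 | 50000  | 100000 |
    |-----------------------------------|-------|-------|--------|--------|
    | Normal 1 char                     | 79    | 432   | 2538   | 17336  |
    | StringBuilder 1 char              | 0     | 0     | 1      | 3      |
    | Array Join 1 char                 | 0     | 41    | 1      | 3      |
    | Normal 5 chars                    | 502   | 5664  | 24183  | 95160  |
    | StringBuilder 5 chars             | 0     | 1     | 2      | 4      |
    | Array Join 5 chars                | 0     | 1     | 2      | 4      |
    | Normal 10 chars                   | 1716  | 11777 | 47859  | 209215 |
    | StringBuilder 10 chars            | 0     | 1     | 3      | 7      |
    | Array Join 10 chars               | 0     | 1     | 2      | 5      |
    | Normal 20 chars                   | 4395  | 24340 | 105893 | 454174 |
    | StringBuilder 20 chars            | 1     | 2     | 5      | 16     |
    | Array Join 20 chars               | 2     | 1     | 3      | 7      |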

     

    The results are in milliseconds and clearly show that the overhead of instantiating the StringBuilder has no measurable performance cost compared to normal concatenation. Even when making 10000 concatenations of a 20 character string, the StringBuilder only uses 1 millisecond!

    What is more interesting is how much the length of the string you’re concatenating affects normal concatenation. Concatenating 5 characters 10000 times is over 6 times slower than concatenating 1 character 10000 times (502 ms versus 79 ms), and with 10 characters it is over 21 times slower (1716 ms versus 79 ms).
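    A simplified cost model helps explain why normal concatenation degrades so quickly. It is an assumption for illustration, not something measured in these tests: suppose each concatenation copies the entire result so far. With pieces of length L, the k-th concatenation then copies about kL characters, so n concatenations copy roughly

    ```latex
    C(n, L) \approx \sum_{k=1}^{n} kL = L \cdot \frac{n(n+1)}{2}
    ```

    characters in total. Under this model the work grows linearly with the string length and quadratically with the number of concatenations, whereas a buffer-based approach copies something closer to nL characters. The measured times grow even faster than the model predicts: going from 10000 to 100000 concatenations of 1 character takes about 219 times longer, not 100 times. My guess is that memory allocation and garbage collection of ever larger strings add to the cost, but the tests do not isolate that.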

    The reason we even bother thinking about performance is that humans do not like to wait too long for a response to their actions. Usually we don’t want to wait more than 2-3 seconds. We must assume that our application has to do other things than just the concatenation, which probably means that the concatenation can take at most somewhere between 500ms and 1000ms. Assuming that we use normal concatenation, how many concatenations can be made within that timespan?

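    | String length (chars) / Runtime | 100ms | 250ms | 500ms | 1000ms |
    |---------------------------------|-------|-------|-------|--------|
    | 50                              | 1252  | 1815  | 2456  | 3419   |
    | 100                             | 482   | 1228  | 1670  | 2413   |
    | 200                             | 565   | 866   | 1206  | 4689   |
    | 400                             | 387   | 606   | 827   | 1178   |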

     

    This again shows that the length of the strings you’re concatenating is extremely important for performance.

    To sum up, my advice would be to always use the StringBuilder class. It might require 2-3 extra lines of code, but there’s no measurable penalty and you ensure that your application scales well. The way I see it there’s no reason not to use the StringBuilder. Better safe than sorry… always use the StringBuilder, people!

    What are your thoughts on the subject?

    Are there any aspects that I’ve missed in my tests?

    Feel free to comment!

     

    As a part of my morning routine I read a lot of news pages on different subjects that interest me. But I also read a lot of different technical blogs. I think that it is a great way to get new input as a developer, and it supports my own development as a "code monkey". This small post is just a list of links to some of the blogs that I find interesting.

    Janko at warp speed is definitely one of my favorite blogs. Janko has a lot of interesting blog posts about development, design and UI. His posts are always extremely well written, entertaining and very informative.

    Scott Gu's blog is also a great blog to follow. Scott Guthrie is the Corporate Vice President, .NET Developer Platform and blogs about a lot of the new stuff that Microsoft is coming up with. Very interesting stuff if you like to follow the newest .NET technology.

    The guys behind the Stack Overflow podcast, Joel Spolsky and Jeff Atwood, both have some great blogs. Jeff Atwood in particular has some great and very entertaining posts. Joel does not post very frequently anymore and stated, in a recent Stack Overflow podcast, that he's planning to stop the blog soon.

    Scott Hanselman's blog is also great fun to follow. His posts cover development but also all kinds of other nerdy stuff.

    Justin Etheredge has a blog called CodeThinked. His blog posts are very, very well written and cover different aspects of development - often .NET.

    Dave Ward's blog, Encosia, is also a very interesting blog. Unfortunately Dave Ward doesn't post very often - usually only once or twice per month.

    I also subscribe to some news aggregating feeds:

    DZone

    DotNetShoutout

    Smashing Magazine

    ASP.NET Daily Articles

    Go check some of these blogs and feeds out. I hope you enjoy them as much as I do.