Tip: Enabling ‘Add Mirror’ with Windows 7 Software RAID

By erik at December 30, 2010 01:19
Filed Under: General

A quick tip for those who might be moving from a mobo-enabled hardware RAID approach to a software RAID approach as enabled by Windows 7.  The particular use case I’ll describe here is for RAID 1 (Mirroring).  A typical scenario could be:

  1. You’ve got a mobo that has a hardware feature for enabling SATA (or other) RAID.  You had decided to use RAID Level 1 (mirroring) to enable redundancy of data across disks using this mobo-based embedded RAID.  You believed that this would provide to you the ability to recover quickly in the event of a single disk loss.  Note: RAID 1 is NOT a backup solution.
  2. You’ve had a RAID warning on one of the disks.  You reboot and discover that the assigned array drive doesn’t contain all of your recent work.  This could be because of the hardware vendor’s implementation of write behavior, caching, etc. You disable RAID completely in the BIOS (or via driver configuration), and enable the other drive (mount it online) as your former RAID volume letter, and discover your data is there.  Maybe it was just a mobo driver problem?  Likely.  I’ve wrestled with mobo-enabled RAID solutions for years now only to finally realize they just don’t work as advertised.  If you need real hardware RAID, go to Adaptec.
  3. In a sigh of relief upon booting your computer and seeing your data, you decide that you should now really understand how to recover the mirror in the event of various types of failures, and start researching what really can happen when an embedded mobo RAID solution blows up.  You essentially discover you have to have a replacement mobo and drivers handy.
  4. Realizing this might not be realistic, you remember that Windows has software RAID features.  You start researching, and find out that in the event of a mobo failure, you can (in theory) take your Windows software RAID drives to a replaced machine.  You also learn that software RAID has improved significantly recently and you may only suffer a very small penalty on write operations, which are a minority of the operations for most non-server applications.
  5. You fire up Windows (in my case, Windows 7 Ultimate 64-bit), and launch Disk Management.  What next?

At this point, assuming there aren’t actually any hardware issues with your drives, you can ensure that your good drive is mounted as your former RAID volume letter.  Take the second drive and wipe it, but only if you have backups of your data!

Now, you go to mirror the second drive with the first (that has your data).  How to do this?  In Disk Management, you right-click on the drive (in the right-hand space representation) of the drive that has your data, and click ‘Add Mirror’, like so:

image

Except, you discover, as is highlighted above, that Add Mirror is greyed out, and disabled.  What gives!?

The solution was not trivial, but worked for me:

  • You must ensure that your secondary drive, the one you’ll be creating a mirror of your first drive with, is ‘Unallocated’ (Delete Volume).
  • If that doesn’t work, then perform a ‘Shrink Volume’ command on the drive that contains your data.  Choose a very small shrink amount, such as 512MB.  You should have much more than this free.  This operation may take awhile.  Go grab a beer – or a scotch, or three.
  • When that operation is complete, you’ll be left with 512MB of unallocated space.  This of course is simply annoying, I want to be able to use all the space I have!  So…
  • Now perform an ‘Extend Volume’ command on the drive that contains your data.  Extend it by the maximum allowable amount.

Voilà!  Now right-clicking the drive that has your data on it reveals the ‘Add Mirror’ command to be enabled for you to extend your existing drive onto the second drive, as a RAID 1 Mirror.  This will invoke a resynchronization, which will take some time.  Grab more beer/scotch, or go to bed and let the drives synch overnight.

Please, one caveat: YMMV on this.  There are simply too many different drive combinations, OS versions, etc. for me to comment on all of them.  However, this worked out nicely for me, I hope this helps someone! 

Happy New Year 2011!

Could it really be? Can it? 64-bit Flash for IE 64? Oh MY!

By erik at November 08, 2010 22:51
Filed Under: General

After arguing with an Adobe person earlier this year about, among other things, the proprietary nature of ActionScript given the re-invigoration of the last few years of interest in the ECMAScript standard, I was sure to get my plug in for a 64-bit version of the popular Flash player browser ‘plugin’ !  I’m quite certain they weren’t listening to me, but I was happy Smile to recently discover the preview here.  Works well for me, even though it is a preview version.

I can’t help but wonder what took so long?  Then, I happened to catch this today (though its been posted for a bit).  Maybe an acquisition by the maker of the world’s most popular browser was the impetus!

Update: The SUCK doesn’t stop on this one: Silverlight doesn’t, repeat DOES NOT run in IE 64!  Steaming mad  See:

image

Update 2: Double SUCK!  Apparently my original link to the Adobe labs download for the 64-bit Flash plugin no longer works (as of 1/1/2011).  Maybe they’ve taken it down!  Ugh.  Well, I’ve got it installed anyway:

image

Tip: Get back your System.Linq; er, reference to System.Core in VS2010

By erik at October 05, 2010 01:39
Filed Under: .NET, Software Development

If you’re like I am, then you can’t stand all the garbage references and using statements in some of the VS Project and Item templates, and instantly remove many of those references.  I talk myself into doing that by reasoning that if I need an assembly reference – that I want to overtly make the choice myself, so that I can think about what the dependency on that component assembly is providing to my current layer.

Well, tonight I got burned by removing references from project templates to System.Core, and finding out that in fact when you try to re-reference this assembly, it still shows in VS2010 as selected, as such:

Solution references showing no reference to System.Core:

image

Note the Add Reference Dialog showing an existing reference to System.Core:

image

What gives?  Well, I blogged this because the solution was not quite straightforward, but trivial when you think about it.  I simply opened up my .csproj file and added the reference in manually using Notepad2, and reloading the project when prompted to do so by VS2010.  Here’s how to add the reference in:

  <ItemGroup>
    <Reference Include="System.Core" />
  </ItemGroup>

Note that the <Reference /> node can be included in another <ItemGroup /> if one exists.

By the way, if you’re wondering about the reference to System.Linq: well – it is in System.Core.dll.  That’s the connection…

Hope this helps someone.

Tip: Limit coverage analysis when using NAnt, NCover

By erik at February 23, 2010 23:19
Filed Under: .NET, Software Development

I was having some difficulty the other day getting code coverage analysis to run on just the selected assemblies that I wanted in my build output directory, and had the requirement (placed on myself) that I didn’t want to list the assemblies declaratively in my build configuration file.  Not finding a quick solution, I thought I’d share what I came up with – it may end up being helpful to someone out there.  Please do let me know if you have another solution, I’d be interested in hearing about it.

As a background, I’m using the Gallio suite (including MbUnit) for testing which includes the Gallio test runner and custom Gallio NAnt task.  I want to limit the list of covered assemblies to those meeting a set of criteria, not all assemblies in a given folder.  The solution I came up with involves writing a custom NAnt function, which turns out to be a trivial thing to do.

First, a portion of my Gallio NAnt task:

    <gallio runner-type="NCover" 
            working-directory="${build.output.dir}"
            report-types="html"
            report-directory="${test.output.dir}"
            show-reports="false"
            failonerror="true"
            verbosity="Normal"
            echo-results="true">
      <runner-property value="NCoverArguments='//q //a ${coverage.assemblies}'"/>

Note the <runner-property /> element with NCoverArguments specified in the value attribute.  In this attribute, I specify a list of NCover runtime arguments, including the //a argument used to specify the list of assemblies to be profiled delimited by semi-colons.  The trick for me was to set this list dynamically.  I also, quite frankly, didn’t want to get into writing a ton of NAnt code using things like string::concat functions and whatnot.  Before I call this <gallio /> task in the appropriate target, I set a property like so:

    <property name="coverage.assemblies" 
              value="${mlincek::getcoverageassemblies(assembly.output.dir)}"/>

Essentially, here, I have set the list of coverage assemblies (property/variable ‘coverage.assemblies’) by calling out to a custom-written NAnt function that I have written, and passing in my build output directory as a string.  Note this function returns a string:

  <!-- Custom NAnt functions -->
  <script language="C#" prefix="mlincek">
    <code>
      <![CDATA[
        [Function("getcoverageassemblies")]
        public static string GetCoverageAssemblies(string assemblyFolder)
        {
          StringBuilder sb = new StringBuilder();
          FileInfo[] files = (new DirectoryInfo(assemblyFolder)).GetFiles();
          foreach (FileInfo file in files)
          {
            string fileName = file.Name;
            if (!fileName.EndsWith(".Tests.dll") && !fileName.EndsWith(".pdb"))
            {
              int dllPosition = fileName.LastIndexOf(".dll");
              sb.Append(fileName.Substring(0, dllPosition) + ";");
            }
          }
          return sb.ToString();
        }
      ]]>
    </code>
  </script>

You can place this script task at the bottom of your NAnt .build file, or do as I do and place it into a common.build file which you can reference via a NAnt <include /> task.  Of course, this function could be made more flexible and used for other things besides setting the coverage assembly list simply by passing in the filter criteria either as a string parameter, an expression, or a collection of rules (rule objects), but hey, this is a build file for gosh sakes and it got the job done for me.

Now, I can add assemblies to my overall solution and as long as I follow my naming conventions for projects and build folder structure, they will be picked up automatically for coverage analysis, per my filter criteria.

Hope this helps someone!

Good Database Design, Revisited

By erik at February 23, 2010 14:25
Filed Under: Architecture, Design

Once again, I find myself feeling bad for not writing recently.  I guess the duties of family, job and the holidays got the best of me these last many months.  I hope to write more frequently in the next several weeks and months…

Anyway, I came across an article related to one of my initial posts that I felt compelled to share, with the hope that you will read it here.  In this article entitled Seven Deadly Sins of Database Design, Jason Tiret of Embarcadero Technologies cites seven ‘deadly’ sins that one can commit when designing a physical data model (many of these points also apply to conceptual/logical models, too):

  1. Poor or missing documentation for database(s) in production
  2. Little or no normalization
  3. Not treating the data model like a living, breathing organism
  4. Improper storage of reference data
  5. Not using foreign keys or check constraints
  6. Not using domains and naming standards
  7. Not choosing primary keys properly

I think Jason has nailed many of the additional core issues with planning and designing physical data model implementations, above and beyond my initial post on design and normalization.  Thanks for the great work, Jason!

Finally, Sweet Validation…

By erik at October 13, 2009 00:40
Filed Under: Architecture, Design, Software Development

Hello all!  Sorry I have been away for several months – it has been a busy spring, summer and now fall here in northeast Ohio.  I thought I’d drop a quick post today highlighting a great piece of research that I just finished reading.

Those of you who know me know what a great fan I am of Conway’s Law.  Having worked for and consulted for many organizations over my career thus far, I have found Conway’s Law to be a solid, immutable truth.  For those of you not familiar with Conway’s Law, it essentially states that:

…organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations.

Since software development is about expressing and constructing a solution to solve a problem, the pattern of communication about the problem domain and the vocabulary used to describe it becomes paramount.  In addition, the manner in which this pattern of communication takes place and evolves – ultimately – influences solution design, construction and implementation.  I guess I like to say that good, well-run organizations produce good software.  And, that… …well, you get the picture.  Note that in my experience the word “organization” here applies to both the technology department and the business people in the organization (can’t have one without the other).

Though there has been additional research on Conway’s Law since Melvin Conway first introduced it in 1968, empirical evidence has seemingly been hard to come by.  Thus, often (at least from my perspective), Conway’s Law can end up relegated to the subject of discreet giggling around the water coolers and hallways of dysfunctional organizations. Wink

How fitting is it then that this study, in the midst of the unit test code coverage quality debate, actually ties organizational metrics to the quality of software produced!?  I say, quite fitting.  Take a moment and read it.  I believe it will validate all that you’ve likely come to know and expected about the effects of “the organization” on producing software.

Non-softies take note: You just might even find it humorous that the guinea pig used to exercise the metrics was none other than Vista!

Cheers!

TheInfluenceOfOrgStructureOnSoftwareQuality.pdf (172.27 kb)

Dino Esposito on dotnetrocks.com

By erik at June 30, 2009 00:36
Filed Under: .NET, Architecture

Per my last post on Some Interesting Architecture Reading, I wanted to note that one of the authors of the book Microsoft .NET: Architecting Applications for the Enterprise, Dino Esposito, was recently over on dotnetrocks.com doing a show with Carl Franklin and Richard Campbell.  He mentions that book which I had recently reviewed in my blog, and of course has a nice interview with Carl and Richard on interesting things architectural.

By the way, if you’ve never checked out dotnetrocks.com, it’s a great way to catch up on trends occurring in the .Net stack when you’re driving or have other time when you can listen to an audio podcast.

Some Interesting Architecture Reading

By erik at May 28, 2009 20:47
Filed Under: Architecture, Design, .NET

This past winter I completed a well-done book by Dino Esposito and Andrea Saltarello entitled Microsoft .NET: Architecting Applications for the Enterprise.  I’d like to give my endorsement to this well-written, well-thought-out book.  It encapsulates all of the core concepts needed to correctly build an enterprise-class, durable, yet flexible application.  In many cases, it also gives these concepts detailed treatment, as well.

What I like most about this book is that, although it’s got “.NET” in the title – and it occasionally touts Microsoft-specific technologies, most of book content is dedicated on traditional application architecture and design concepts that are time-tested, proven, and have developed outside of the .NET community.  This includes concepts such as UML, layered architecture and the use of design patterns.  This is a good thing.

Although the book is not a replacement for the core application design and architecture books that I consider mandatory reading from authors such as Booch, GoF, and Fowler (more on this in a future post), it is a great book to dive into that you can consider a comprehensive and detailed treatment of application architecture.  Check it out.

Additionally, while listening to a podcast today, I heard about App Arch Guide 2.0 from the Microsoft Patterns & Practices team.  While I have not read this book, I can say that the interview with one of the authors was compelling enough for me to download it.  More to come on this one.

Tip: Remove Stack Trace Information in ASMX (XML) Web Services

By erik at May 26, 2009 13:36
Filed Under: .NET, Architecture

A non-obvious way to alter the behavior of .NET automatically populating Soap fault elements with stack trace information (specifically, in the faultString element) is to use the customErrors configuration element, which is part of the ASP.NET Settings Schema.

I state this is non-obvious because in the documentation, the following statement is made here under the Remarks section:

The <customErrors> element does not apply to errors that occur in XML Web services.

However, the use of this element’s mode attribute clearly does alter the behavior, when setting the mode attribute to “On”, in the system.web configuration section:

<configuration>
    <system.web>
        <customErrors mode="On" />
    </system.web>
</configuration>

Note that the default setting for the customErrors element’s mode attribute is “RemoteOnly” which will add stack trace information for local requests and remove it for remote requests.  This is usually desirable behavior but depending upon your own needs it is good to know the availability of the mode attribute’s “On” setting to always, universally remove the stack trace information, should you desire to do so.

A Simple Case For Good Database Design

By erik at May 08, 2009 22:52
Filed Under: Architecture, Design

So I have been involved in a few endeavors lately that have made me realize something that I often do second nature and take for granted -- good database design.  I don't know what I've thought throughout my career thus far, but I guess I just always figured that everyone thought like I did and ensured the delivery of a well-designed, robust, database model as persistence for their software.  Apparently I figured wrong.

I am running into situations where a developer had both the complete opportunity and full control to design a robust data model, but didn't.  Why?  Well, we could get into an interesting discussion on top-down design vs. bottom up design, ORMs, and the like.  However, that is a subject for a future post (er, perhaps many future posts).  We could also take into consideration performance implications of certain database designs versus others, more on this in a bit, below.  My observations have led me to the assertion that, for whatever the reason is, sometimes there are developers who just don't do good database designs.  In line-of-business apps, and especially those that are enterprise-class, and of strategic importance, perhaps nothing is more important than a good database design.  In this post, I will explain why this is with a simple example:

Imagine a case where you are designing perhaps a supply chain management system and you need to cache supplier part information in your database for performance reasons (as opposed to querying the supplier's remote data repository directly).  You start building your app by getting an Excel sheet of suppliers from your vendor management folks who work with the suppliers.  It has columns like so:

Supplier Name, Supplier Address, Supplier CSZ, Supplier's Item #, Supplier's Price

It certainly might be tempting to upload / import this data right into a single database table as is and begin working with it.  However, there are clearly problems with doing this:

  1. Redundancy of Data.  Note that for every row (sometimes, in database theory called a "tuple") that contains a different supplier item, the supplier's address repeats.  This creates duplicates in terms of the name, address and CSZ fields.
     
  2. Brittle Data Consistency.  As a direct consequence of the repeated address, it is conceivably possible that should the supplier's address change, an update is performed inconsistently (perhaps only to 1 row).  This would result in duplicate addresses for the supplier -- we physically have two addresses for the supplier when we know logically we should only have one (let's pretend this isn't a multi-site supplier).  In database theory you'll sometimes hear this problem described as "update anomalies".
     
  3. Insertion Anomalies.  Interestingly, if you think about it you might note that if we make the table field Supplier's Item # and Supplier's Price required (NOT NULL), then we can't record the address of a supplier if that supplier doesn't currently supply us with at least one item.  I suppose we could be stuff garbage in the fields, but that just devalues our database and our system.  We could make the item number and price fields non-required (NULL), so we could store a row with the address.  But, would we remember to delete the "hacked" address row when the supplier finally starts supplying us with items?  And, if we do this, hopefully you can see that we can easily run into a problem if data gets into the database with a item number specified but no price, or vice versa.
     
  4. Deletion Anomalies.  Imagine what happens now when we decide that we need to delete all rows containing supplier item and price information, because the supplier no longer supplies us with items?  What happens to the supplier's address information?

It is easy to see why we need good, well-designed database models from this example.  I've only scratched the surface but if you're interested pick up a classic text on relational theory.  So now what?

The way to arrive at a good design with our example is to decompose (break down, split apart) the Excel sheet into a set of multiple tables, like so:

Table: Suppliers (Columns: SupplierKey, SupplierName, Address, City, State, Zip)
Table: Items (Columns: ItemNumber, SupplierKey, Price)

I've done a few things here.  First, I've made a decision to store the data in two tables in my design.  The first table is called "Suppliers".  I've also added a contrived key to the Suppliers table, called "SupplierKey".  A contrived key is a key you make up, or generate, (or have generated for you), when a good natural key is not present.  A natural key is a semantically clear identifier for an entity being represented by the table.  So, an example might be the state driver's license bureau where information about you is kept in a Driver table using your driver's license number as the natural key.  In this example, I've created a contrived key because names, like our supplier name, typically don't make good key fields.

Second, I broke out the City, State, and Zip fields.

Finally, I have an Items table, and have moved the ItemNumber and Price fields to it.  ItemNumber in this case is a good natural key.  I also added what is known as a foreign key field, named SupplierKey.  The SupplierKey foreign key field in table Items is the same SupplierKey field in table Suppliers (same data type).  This will be used to relate the two tables together to perform queries.

The resultant design now solves all four problems mentioned above.  The process of decomposition in database design is called "normalization" and we say that each table in a database design or model is in a state of one of the "normal forms".  The basic idea behind normalization is to identify your entities (typically nouns in your problem domain), and create tables for each.  Then, you identify attributes for your entities, and create columns for each.  You want to validate that each of your entity attributes are facts about and depend upon the entity, and not other attributes or combinations of other attributes.  The process is rounded out by creating links or relationships between entities.  A full and appropriate treatment of normalization is outside the scope of what I wanted to do here, not to mention it's been done by others many times over.  Aside from grabbing a hold of any of E.F. Codd's work (E.F. Codd is considered the "father of normalization"), perhaps one of the early and great articles on normalization was written by William Kent, entitled "A Simple Guide to Five Normal Forms in Relational Database Theory", and located here.  A good summary and jump page to check out as well would be wikipedia, also.  Thus, if you've never taken the time to study normalization, I can assure you that now is the time.  You will reap great benefit from the effort.  Normalization: live it, learn it, love it. Smile

So, as mentioned earlier, what if you fully understand the process of normalization, the normal forms, etc., and choose not to normalize (or use a relatively denormalized design) because of performance reasons?  And just exactly what are the performance concerns?  Classic database and relational theory taught that when you write database operations via the database engine's DML (SQL) and had to join tables, that the joining of tables was an expensive operation.  It is my belief that we have progressed, especially with innovative database management systems such as Microsoft SQL Server, well past this and it isn't nearly as true today as it once was when initially proclaimed.  The benefits of normalization for both transactional and data warehousing scenarios typically outweigh any [perceived] performance gains.  I found a nice point on this topic on this IBM treatment of normalization:

In practice, many databases are de-normalized to greater or lesser degree. The reason most often stated has to do with performance -- a de-normalized database may require fewer joins and can, therefore, be faster for retrievals.

While this reasoning may be true, the usual caveats against premature optimization apply here as well as everywhere else. First, you should determine sufficiently that a performance problem exists and that the proposed de-normalization improves it before introducing a conceptually suboptimal design.

Furthermore, a de-normalized schema can be harder to update. The additional integrity checks that are necessary in this case may offset the performance gains for queries obtained through denormalization.

And another nice point, here:

Well normalized data makes programming (relatively) easy, and works very well in multi-platform, enterprise wide environments. Non-normalized data leads to heartbreak.

In summary, I hoped I've inspired some of you to think about your database model design more thoroughly, and to check out the topic of database normalization.  Until next time...

 

Visitors

Disclaimer

Views expressed are my own and in no way represent those of my employer; this is in no way a work or work-related blog.  All postings are presented "AS IS" and confer no rights.

Copyright

All content copyright © 2009-2011 by Erik Mlincek.  You may not use content without the express written consent of the author, however you may link back to blog topics hosted at mlincek.com without obtaining permission in advance.