Validate SSIS package on server

SSIS server validation



  • Any change in database schema will break the associated SSIS package.
  • Generally each change is responsible to identify dependent components to be included in the change scope.
  • But what about any missed dependencies. E.g. Recent production issue.
  • Such broken SSIS can be identified in Pre-Prod only if pre-prod is well controlled and identical copy of prod environment.
  • Process is required to identify packages impacted by dependencies changes like database schema.
  • Proactive validation of SSIS package could help to identify the issue and reduce the system downtime.

High level architecture



The solution is capable of validating all ssis packages for sql server version 2008 on-wards. Instead of publishing the full source code below is the pseudo code and sql scripts to validate the packages.

  1. Create C# console base project which takes the server name you want to validate. The app calls the web service to validate the SQL server (validate SSIS packages on that server).
  2. Create web api project and expose following methods (Job Controller)
    1. Validate – start the validation process by creating SQL agent job dynamically on target server
    2. ValidationFinished – this will be invoked by job when validation is finished.
    3. GetValidationStatus – this will provide the SQL agent job status currently validating server

The Validate method works as follows..

  • Get the SQL Server version
private Version GetServerVersion(string serverName)
 var connectionString = string.Format("Server={0};Database=master;Trusted_Connection=True;", serverName);
 using (var connection = new SqlConnection(connectionString))
 return new Version(connection.ServerVersion);
  • Create interface IProcessor and implement it for sql server 2008 and 2012 as they both have different way to validate package
  • Based on server version get the eligible concrete implementation.
  • The validation engine will create the validation job on target sql server. for each proxy account, the script will create job step to trigger a network share package running under proxy account. (for details see createValidationJob.sql)
  • Each validation step gets the package scheduled under the current account and validate the package, populate the results back on central server. and in the end extract the job history and delete the job.)
  • The network deployed ssis package uses same engine to validate 2008 and 2012 via IProcessor since validation is done differently for both servers.

The common scripts used for both servers are here (download and rename doc to zip)


The SQL server 2008 validation is done via DTEXEC with /validate command (download and rename doc to zip)

SQl 2008-Scripts

The SQL 2012 validation is done via EXECUTE SSISDB.catalog.validate_project stored proc. (download and rename doc to zip)

SQL 2012 Scripts

The centralized server which keeps the status of SSIS validation looks like


The font end to display the data was build on angular 2 API..

ViewModel First – ViewCaching

With View model first approach, view are usually created via DataTemplate, and each time a view model is injected in the content control, the corresponding view is recreated.

If you have complex view where it take bit time to create view, you may see the performance hits. To avoid the performance hit you may want to cache the view and use the cached view when available. This can be done via creating CacheContentControl and you choice of view factory.

The basic idea is to wrap the view inside CacheContentControl and delegate the view creation logic to your view factory. The attached sample uses weakreference view factory,say if GC has not collected the view, you may use the cached view instead of creating view each time. This will work with controls like tab control docking controls etc.

The CacheContentControl  code

public class CacheContentControl : ContentControl
        public CacheContentControl()
            Unloaded += ViewCache_Unloaded;
            ViewFactory = WeakReferenceViewFactory.Instance;

        void ViewCache_Unloaded(object sender, RoutedEventArgs e)
            Content = null;

        private Type _contentType;
        public Type ContentType
            get { return _contentType; }
                _contentType = value;
               //  use you favorite factory
                Content = ViewFactory.GetView(value);

        public IViewFactory ViewFactory { get; private set; }

In the DataTemplate, use the ViewCache, pass the type of the real view you want to use:

<DataTemplate DataType=”{x:Type moduleB:Panel3Vm}”>

<Border Background=”Green”>

<local:CacheContentControl ContentType=”{x:Type moduleB:Pane3 }” Margin=”5″/>



The example uses standard data template (red option) to demonstrate each time you navigate to panes the corresponding views are recreated with new hash code.

If you select CacheContentControl (green option) and navigate the view is only created once. If you click on button to collected the GC then view are created on navigation if they are claimed by GC.

Why ViewModel first ?

I prefer to use view model first approach. For many reasons:

  • Vms are your application containing most of logic apart from glue code in form of behaviors or triggers.
  • If you creates views then you are responsible for its life and cleanup code. You have to deal with threading and other issues which are difficult to test. On the other hand of you create vms and leave the view creation logic with WPF via data template.. you don’t have to worry about threading issue. And there will be better separation of concerns.
  • With vm first approach zero code behind.
  • With a project level isolation for view and vms you can restrict developers using view specific things like dispatcher in the view model leaving more cleaner and testable code base. I.e view project sprojec to vm. And vm project should not refer to any presentation lib.
  • If there is clear boundary between view and vm. Both can evolve and will be less fragile.
  • Allows more complete testing of logic to open new Views and ViewModels
  • Tends to be DRYer (don’t repeat yourself ) as applications get larger
  • View and ViewModel are more independent and can be worked on separately more easily.

For more details see attached source code (download and rename docx to zip)

ViewCache Source Code

TPL Dataflow – Concurrent Programming

TPL DataFlow

TPL Dataflow is an in-process actor library on top of the Task Parallel Library enabling more robust concurrent programming.

Parallel computing is a form of computation in which multiple operations are carried out simultaneously.Parallel computing is closely related to asynchronous programming, using many of the same core concepts and support. Asynchronous programming is an approach to writing code that involves invoking operations such that they don’t block the current thread of execution.Many personal computers and workstations have two or four or 8 cores (that is, CPUs) that enable multiple threads to be executed simultaneously. Computers in the near future are expected to have significantly more cores. To take advantage of the hardware of today and tomorrow, you can parallelize your code to distribute work across multiple processors. In the past, parallelization required low-level manipulation of threads and locks.

The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications. The TPL scales the degree of concurrency dynamically to most efficiently use all the processors that are available. In addition, the TPL handles the partitioning of the work, the scheduling of threads on the ThreadPool, cancellation support, state management, and other low-level details. By using TPL, you can maximize the performance of your code while focusing on the work that your program is designed to accomplish.

Data parallelism refers to scenarios in which the same operation is performed concurrently (that is, in parallel) on elements in a source collection or array. In data parallel operations, the source collection is partitioned so that multiple threads can operate on different segments concurrently.

The Task Parallel Library (TPL) is based on the concept of a task, which represents an asynchronous operation. In some ways, a task resembles a thread or ThreadPool work item, but at a higher level of abstraction. The term task parallelism refers to one or more independent tasks running concurrently. Tasks provide two primary benefits:More efficient and more scalable use of system resources & More programmatic control than is possible with a thread or work item.

The concurrency models we has discussed so far have the notion of shared state (data) in common.Shared state can be accessed by multiple threads at the same time and must be thus protected, either by locking or by using transactions. Both, mutability and sharing of state are not just inherent for these models, they are also inherent for the complexities.Unfortunately, programmers have found it very difficult to reliably build robust multi-threaded applications using the shared data and locks model, especially as applications grow in size and complexity.Making things worse, testing is not reliable with multi-threaded code. Since threads are non-deterministic, you might successfully test a program one thousand times, yet still the program could go wrong the first time it runs on a customer’s machine.

We now have a look at an entirely different approach that bans the notion of shared state altogether. State is still mutable, however it is exclusively coupled to single entities that are allowed to alter it, so-called actors.The actor model in computer science is a mathematical model of concurrent computation that treats “actors” as the universal primitives of concurrent digital computation: in response to a message that it receives, an actor can make local decisions, create more actors, send more messages, and determine how to respond to the next message received.For communication, the actor model uses asynchronous message passing. In particular, it does not use any intermediate entities such as channels. Instead, each actor possesses a mailbox and can be addressed. These addresses are not to be confused with identities, and each actor can have no, one or multiple addresses. When an actor sends a message, it must know the address of the recipient. In addition, actors are allowed to send messages to themselves, which they will receive and handle later in a future step.

The Task Parallel Library (TPL) provides dataflow components to help increase the robustness of concurrency-enabled applications. These dataflow components are collectively referred to as the TPL Dataflow Library. This dataflow model promotes actor-based programming by providing in-process message passing for coarse-grained dataflow and pipelining tasks.The TPL Dataflow Library provides a foundation for message passing and parallelizing CPU-intensive and I/O-intensive applications that have high throughput and low latency. It also gives you explicit control over how data is buffered and moves around the system.

If you want to scale your application beyond single machine or process then ServiceBus (NServiceBus, Microsoft Azure, etc) are the best candidates. These are designed around message oriented architecture and you can achieve highly reliable, available and scalable application. As an architect i always focus on reliability, after all, a highly available and scalable service that produces unreliable results isn’t very valuable. – We will need another post to cover the in-depth of service bus..

TPL Dataflow (TDF) is a library for building concurrent applications. It promotes actor/agent-oriented designs through primitives for in-process message passing, dataflow, and pipelining. I have been playing with dataflow since its CTP was released and i found its very use in cases where you have to process data in form of a pipeline.With just few in-build blocks you can easily and quickly build concurrent app..

The primitive blocks provided by dataflow are

  • Buffering Blocks – Holds data for use by data consumers.
    • BufferBock(T) – FIFO queue of message that can be written to multiple sources or read from by multiple targets.
    • BroadcastBlock(T) – Used when you must pass multiple messages to another component.
    • WriteOnceBlock(T) – similar to broadcastblock except object can be written to one time only
  • Execution Block – call a user provided delegate for each piece of received data
    • ActionBlock(t) – calls a delegate when it receives a data – excepts synchronous or asynchronous delegates
    • TransformBlock(Tinout,TOutput) – call function delegates to transform the incoming message to another type- excepts synchronous or asynchronous delegates
    • TransformManyBlock(TInput , TOutput) – similar to TransformBlock except it can produce zero or more output values for each input value, instead of only one output value for each input value. – excepts synchronous or asynchronous delegates

Degree of Parallelism

Every ActionBlock<TInput>, TransformBlock<TInput, TOutput>, and TransformManyBlock<TInput, TOutput> object buffers input messages until the block is ready to process them. By default, these classes process messages in the order in which they are received, one message at a time. You can also specify the degree of parallelism to enable ActionBlock<TInput>, TransformBlock<TInput, TOutput> and TransformManyBlock<TInput, TOutput> objects to process multiple messages concurrently.

Now lets look at the implementation details of a web crawler

Request Buffer
/ Save     ParseLink

The messages to download a Url is received in the request buffer which is downloaded by a TranformBlock and converted into type safe page type message. The page message is then broadcasted using Broadcast block, which is further received by save ActionBlock and ParseLink Block. The parse link block parses the urls in the page and if they belong to same page it will raise an event for each url. The consumer of engine will receive the url and if its a new URL it will be posted back to engine.. The save action block will save the page to disk.

This is very basic example but you can see the message based approach is much more simpler than a shared resource + threading approach.

The TPL dataflow is good in case you don’t want to scale the solution beyond single machine as it offer a in process message base approach.With a proper service bus like NServiceBus you can scale out the solution beyond single machine and multiple servers could process the request to achieve the high throughput and off-course with easy to build,maintain clean code base.

In production there are more things you have to take care like logging, error handling, transactions, unexpected failure recovery, dependency injection,loosely coupled components, extensibility, scalability and so on.. Frameworks like NServiceBus provides all these features alone with API to handle more complex business problems.

Download Code : Here (Partially finished but working POC)

Asp.Net MVC flavours

The main purpose of the MVC (Model, View and Controller) architecture is to make separation of the business layer (logic) and the application layer (data)  from the presentation layer to the user. Model-View-Controller is a software pattern for achieving isolation between different application components. Its always desirable for software applications (especially web-based applications) that there must be clear separation between business logic and the user interface. A model represents the state of a particular aspect of the application. A controller handles interactions and updates the model to reflect a change in state of the application, and then passes information to the view. A view accepts necessary information from the controller and renders a user interface to display that information. Over past few years many version of ASP.Net MVC were release moving towards more and more mature framework for web development.Below is brief comparison of features release in each version

MVC 5 – October 2013

  • Bootstrap replaced the default MVC template.
  • ASP.NET Identity for authentication and identity management.
  • Authentication Filters for authenticating user by custom or third-party authentication provider.
  • With the help of Filter overrides, we can now override filters on a method or controller.
  • Attribute Routing is now integrated into MVC 5

MVC 4 – August 2012

  • ASP.NET Web API, a framework that simplifies the creation of HTTP services and serving a wide range of clients.
  • Follow to create your first ASP.NET Web API service
  • Adaptive rendering and other look-n-feel improvements to Default Project Templates.
  • A truly Empty Project Template.
  • MVC4 also uses Razor View Engine as a default view engine with some new features like condition attribute and ‘Tilde slash’
  • Based on jQuery Mobile, new Mobile Project Template introduced.
  • Support for adding controller to other project folders also.
  • Task Support for Asynchronous Controllers.
  • Controlling Bundling and Minification through web.config.
  • Support for OAuth and OpenID logins using DotNetOpenAuth library.
  • Support for Windows Azure SDK 1.6 and new releases.
  • MVC4 provides better support for Jquery like Jquery Mobile
  • Client side validation, Jquery validation and enhanced support for asynchronous methods
  • Supports many new features for mobile apps and also provides new mobile project template and default templates are refreshed and modernized

MVC 3 – January 2011

  • New Project Templates having support for HTML 5 and CSS 3.
  • Improved Model validation.
  • Razor View Engine (.cshtml for c# and .vbhtml for Visual Basic) introduced apart from Web Forms view engine (.aspx)
  • Having support for Multiple View Engines i.e. Web Forms view engine, Razor or open source.
  • Controller improvements like ViewBag property and ActionResults Types etc.
  • Unobtrusive JavaScript approach, Ajax and Client side Validation, Jquery Validation and JSON binding support
  • Chart, WebGrid, Crypto,WebImage, WebMail Controls
  • Improved Dependency Injection with new IDependencyResolver.It provides powerful hooks with Dependency Injection and Global Action Filters
  • Partial page output caching.
  • TempData, ViewData ,ViewBag
  • Supports not only Master Page but also Layout Page

MVC 2 – March 2010

  • MVC 2 uses only Web Forms view engine (.aspx) as a default View Engine.
  • (HTML Syntax) Web Forms view engine syntax: <%=Html code %>
  • TempData, ViewData
  • Jquery support is Good
  • Supports only Master Page
  • Client-side Validation and Asynchronous controllers
  • support controllers to process requests asynchronously
  • Supports validations using the RangeAttribute, RequiredAttribute, StringLengthAttribute, and RegexAttribute attributes.

MVC 1 – March 2009

MVC CTP – December 2007 

* Notes

  • View Engine is responsible for rendering of the HTML code from your views to the browser.
  • In ViewData, dictionary of objects are accessible via strings as keys
  • ViewBag was added in the C# 4.0 which uses the dynamic feature that allows to add properties of an object dynamically . We can say that ViewBag = ViewData + dynamic feature around the ViewData dictionary