Retrieving Information from Active Directory with SSIS

I was recently asked to help someone retrieve information from Active Directory for use in SSIS. My initial thought was “This shouldn’t be too difficult. I know there is an OLE DB driver for Active Directory, so it should just be a matter of hooking up a few components.” As it turns out, it took a few hours of painful trial and error to get something working. Also, at the end of it all, I found a different approach that was as simple as I had originally envisioned it. Hopefully, this will save some time for the next person needing to do this.
I’ll start with a summary of the results, in case you don’t feel like wading through the rest. It does work, but there are a few caveats. You can use either an OLE DB connection manager, or an ADO.NET connection manager with the .NET OLE DB wrapper. I recommend the ADO.NET connection manager, as it will allow you to use access Active Directory as a source in your data flows. If you are using the OLE DB connection manager, you will be limited to using Execute SQL tasks and passing the recordset into the data flow. There is a sample package attached that shows both approaches.
I started out using an OLE DB connection manager and choose the OLE DB Provider for Microsoft Directory Services. I put in the server name, left the authentication as Windows Integrated, and clicked Test Connection. The test was successful, so I moved on to the next step.

I attempted to set up an OLE DB data source in my data flow, using my OLE DB connection manager. However, nothing worked. So I backed up, and tried an Execute SQL task. Still nothing. My first thought was that I didn’t have the syntax right for the query, but after trying it successfully outside of SSIS, I was looking at the connection again. Eventually I found a reference in MSDN to setting the ADSI Flag to 1 on the connection properties. After updating that, my Execute SQL task started working.

 The Execute SQL was set up to return the full resultset to an object variable. The query used was “SELECT cn FROM ‘LDAP://DC=DOMAINCONTROLLER,DC=local’ WHERE objectClass=’User’”. This retrieves all the users from Active Directory.

I then went back to using an OLE DB Source in the data flow, but it still wouldn’t work. The OLE DB Source didn’t seem like it was able to execute the query or get any metadata from the provider. So I took the approach of using a script source to process the recordset in the data flow.  (See this post for more information on handling recordsets in scripts.) That worked fine, but I was wasn’t completely satisfied.
I went back and tried creating a new connection manager. This time I used the ADO.NET connection manager, but still used the OLE DB Provider for Microsoft Directory Services through the .NET Provider for OLE DB. Just for grins, I didn’t bother setting the ADSI Flag on this connection.

I then created an ADO.NET Reader data source in my data flow, set the connection to the connection manager I just created and put the same LDAP query in it.

When I ran the updated package, it worked first time through (without the ADSI Flag=1). Since it’s been my understanding that it is always preferable to use the OLE DB connection manager when working with OLE DB providers, this behavior was a little frustrating. It seems like you should get the same behavior whether you are using the OLE DB connection manager or the ADO.NET connection manager with the OLE DB wrapper.
Anyway, that was my journey through SSIS with Active Directory. I’d definitely recommend using the ADO.NET connection manager, as it make the process much easier. Please check out the sample package attached if you have further questions.

9 Comments

  1. Man says:

    hi,

    I tried this , but was getting an error “Query failed to parse.Execption from HRESULT: 0x80040e14.” Could you pls help me on this?

    Thanks
    Man.

  2. jwelch says:

    It sounds like the query you are using is not valid. Have you check it in another tool to validate that it is correct?

  3. http:// says:

    Hi,

    Thanks for this article. I was able to connect to an LDAP

    -in the Control Flow using Execute SQL Task and a Connection Manager using OLE DB,

    -in the Data Flow using Datareader and a Connection Manager using ADO.Net

    I now have an issue with LDAP’s configuration : for performance issue it cannot send more that 1000 results for a query. Looping on alphabetical is not a good solution either.

    a+,=)
    -=Clement=-

  4. http:// says:

    This was very helpful, but I am having an issue with the number of rows returned.

    What setting controls the number of rows returned from LDAP? Is there a property I can set — maybe on the ConnectionString to return all rows? My result set is getting limited to 4000. I was thinking it was Page Size, but I must not have the syntax right or the wrong property.

    Thanks!

  5. jwelch says:

    There is a cap enforced by AD on how many rows are returned. It sounds like you are hitting that. I’ll have to do some looking to see if there is a work around.

  6. Matt says:

    There should be a PageSize property that defaults to 1000.

  7. Tobi says:

    The default setting for Active Directory is to return 1000 objects. This can be modified by changing the registry on the search client or using Group Policy. I ran into the same issue a little less than a year ago and found this article on Microsoft TechNet(http://technet.microsoft.com/en-us/library/cc755809(WS.10).aspx) which helped me alot. I usually modify the registry on the search client and do not have any problems.

  8. Mike says:

    Great blow by blow. I’m using 2008R2 and getting a:
    “[ADO NET Source [85]] Warning: The data type “System.Object” found on column “cn” is not supported for the component “ADO NET Source” (85). This column will be converted to DT_NTEXT.” error when I run it.
    The connection tests fine and I can see data in preview but it fails on run. Note microsoft says this about the data reader sources in 2008.
    “Katmai now has an ADO.Net source (replacing the DataReader source) and destination adapters. They function a lot like the OLEDB ones, supporting a custom UI, drop down list of tables/views, and query builder. Its properties are also expression-able. This means we now support ODBC as a destination, as well. ”

    Any thoughts?

    • jwelch says:

      Well, it’s a warning not an error – so have you tried running the package and seeing if it completes? It sounds like the ADO.NET Source doesn’t understand the column type of the cn column. That may or may not be a critical problem. If it comes in as NTEXT, you might be able to just cast it to a different type using a Derived Column.

Leave a Reply