Joining ADO.Net Tables

As you probably recall, ADO.Net uses a "disconnected" database model, meaning that it makes a temporary connection to a database to pull the data and all of the data is locally cached in RAM on the client PC. The DataSet object is, in many respects, a miniature in-memory relational database engine.

This might be a good time to review Using ADO.Net DataRelations

By using the built-in features of ADO.Net you can perform many database kinds of things like:

SQL Join syntax

Joining multiple Tables in a relational database (such as SQL Server) is done via an SQL statement such as:

SELECT Customers.CompanyName, Orders.OrderDate
FROM Customers
INNER JOIN
Orders
ON Customers.CustomerID = Orders.CustomerID

In this example, the Customers table is joined to the Orders table based upon a relationship between the two tables where the CustomerID is the same.

There is no native ADO.Net support for parsing a complex SQL statement like the one above, but you can write a class module that will take two DataTables and join them together in much the same way as it's done via an SQL statement.

The trick is to leverage the ADO.Net feature called DataRelations. The DataSet object can hold a collection of relationships between DataTables. After you have created a relationship, you can then use this relationship to get data from a related table.

Documentation Links

The Example Code

Let's look at a simple example of a LEFT join:

Dim dr_left, dr_right, dr_dest As DataRow
Dim dt_left, dt_right As DataTable
Dim dt As New DataTable

' create a relationship between the two tables
ds.Relations.Add(New DataRelation("__RELATIONSHIP__", dc_right, dc_left, False))

For Each dr_left In dt_left.Rows
    dr_dest = dt.NewRow

    ' copy the data from the columns of the Left table
    For Each column In dt_left.Columns
        dr_dest(column.ColumnName) = dr_left(column.ColumnName)
    Next

    ' get the related rows via the Relationship
    dr_right = dr_left.GetParentRow("__RELATIONSHIP__")

    If Not IsNull(dr_right) then
        ' copy the data from the columns of the Right table
        For Each column In dt_right.Columns
            dr_dest(column.ColumnName) = dr_right(column.ColumnName)
        Next
    End If

    ' add the row to the ResultSet
    dt.Rows.Add(dr_dest)
Next

' delete the temporary relationship
ds.Relations.Remove("__RELATIONSHIP__")

Let's examine this example a little closer. The first thing we do is create a new DataRelation between the two tables. The last argument in the DataRelations constructor is set to False because we do not want to create a corresponding foreign key constraint to go along with our relationship.

' create a relationship between the two tables
ds.Relations.Add(New DataRelation("__RELATIONSHIP__", dc_right, dc_left, False))

Note: While it's true that most SQL joins are performed on Primary Keys in one table against Foreign Keys on the other table, that is not a requirement. So, we can't assume that creating a foreign key constraint would be successful.

The next step is to copy all of the data from the columns in the left table. Since this example is a LEFT join, all rows from the left table should appear in the result set.

    ' copy the data from the columns of the left table
    For Each column In dt_left.Columns
        dr_dest(column.ColumnName) = dr_left(column.ColumnName)
    Next

This step is where the magic of Relations comes to play... we take a row from the left table and then get the related row from the right table. Then we copy the data from the right table's row into the result set just like we did with the left table.

    ' get the related rows via the Relationship
    dr_right = dr_left.GetParentRow("__RELATIONSHIP__")

    If Not IsNull(dr_right) then
        ' copy the data from the columns of the Right table
        For Each column In dt_right.Columns
            dr_dest(column.ColumnName) = dr_right(column.ColumnName)
        Next
    End If

Real-World Example

OK, now that you've got the basics of doing Relationship joins, let's look at a real-world function that can join two tables using LEFT, RIGHT, INNER, and FULL join methods. In this demo function, we create unique column names for the Result Set, since column name collisions are otherwise possible.

 ' Join two tables together
Private Function JoinTables(ByVal LeftTable As DataTable, ByVal RightTable As DataTable, _
 ByVal LeftColumn As String, ByVal RightColumn As String, ByVal JoinType _
 As String) As DataTable
    Dim dr_left, dr_right, dr_dest, dr_temp, drs() As DataRow
    Dim dt, left_dt, right_dt As DataTable
    Dim right_col, left_col, column As String
    Dim i As Integer
    Dim RightColumnList, LeftColumnList As ArrayList

    ' build the detached TempTable that will be returned
    dt = BuildTempTable()
    left_dt = LeftTable
    right_dt = RightTable
    left_col = LeftColumn
    right_col = RightColumn

    ' what kind of join is this?
    Select Case JoinType
        Case "INNER"
        Case "LEFT"
        Case "RIGHT"
            ' We don't really do right joins... we just flip the arguments and
            ' do a left join instead.
            left_dt = RightTable
            right_dt = LeftTable
            left_col = RightColumn
            right_col = LeftColumn
            JoinType = "LEFT"
        Case "FULL"
            ' A full join is kinda complicated... we do a right join, then a left,
            ' then combine the two together. The optimum execution plan for a full 
            ' join should be a "merge" operation... but let's not make things any
            ' more complicated than they already are.
            Dim temp As DataTable
            temp = JoinTables(RightTable, LeftTable, RightColumn, LeftColumn, "LEFT")
            dt = JoinTables(LeftTable, RightTable, LeftColumn, RightColumn, "LEFT")

            drs = temp.Select(TabColName(LeftTable.TableName, LeftColumn) & " is null")
            For Each dr_right In drs
                dt.ImportRow(dr_right)
            Next
            Return dt
        Case Else
            Throw New ApplicationException("SQL syntax error: Unknown Join type '" & _
             JoinType & "'")
    End Select

    ' create a relationship between the two tables
    ds.Relations.Add(New DataRelation("__RELATIONSHIP__", right_dt.Columns(right_col), _
    left_dt.Columns(left_col), False))

    ' let's go!
    dr_temp = dt.NewRow
    LeftColumnList = GetTableColumns(left_dt.TableName)
    RightColumnList = GetTableColumns(right_dt.TableName)
    For Each dr_left In left_dt.Rows

        ' Get the related rows from the "right" table
        drs = dr_left.GetParentRows("__RELATIONSHIP__")

        ' For inner joins, we don't record anything unless there is a matching row
        If UBound(drs) >= 0 Or JoinType <> "INNER" Then
            dr_dest = dt.NewRow

            ' Let's start by just copying the columns from the "left" table
            For Each column In LeftColumnList
                dr_dest(TabColName(left_dt.TableName, column)) = dr_left(column)
            Next

            ' There are three possibilities... there are no matching rows, there is
            ' only one related row, there are many related rows.
            Select Case UBound(drs)
                Case -1
                    ' Just record the row as it is now (with just the columns from
                    ' the left table).
                    dt.Rows.Add(dr_dest)
                Case 0
                    dr_right = drs(0)
                    For Each column In RightColumnList
                        dr_dest(TabColName(right_dt.TableName, column)) = dr_right(column)
                    Next
                    dt.Rows.Add(dr_dest)
                Case Else
                    ' Make a copy of the prototype datarow that we already filled in
                    ' above. It already has the column data from the left table.
                    dr_temp.ItemArray = dr_dest.ItemArray
                    For Each dr_right In drs
                        dr_dest = dt.NewRow

                        ' Copy prototype row (the left table's data)
                        dr_dest.ItemArray = dr_temp.ItemArray

                        ' Copy the columns from the related rows in the right table
                        For Each column In RightColumnList
                            dr_dest(TabColName(right_dt.TableName, column)) = dr_right(column)
                        Next
                        dt.Rows.Add(dr_dest)
                    Next
            End Select
        End If
    Next

    ' delete the temporary relationship we created above
    ds.Relations.Remove("__RELATIONSHIP__")

    Return dt
End Function

' Make a unique column name
Private Function MakeColumnName(ByVal TableName As String, ByVal ColumnName As _
 String) As String
    Return (TableName.Replace(" ", "_").Trim("[]".ToCharArray) & "_" & _
 ColumnName.Replace(" ", "_").Trim("[]".ToCharArray))
End Function

Note: The example above does not show the GetTableColumns() function which included in the source code example.

Downloads/Links

Read a related article on Using ADO Expressions
Read a related article on Using ADO DataRelations
Read a related article on In-Memory SQL Engine
Download the complete VB.Net Source code example used in this article: Join3.zip