For a programmer to fully understand and efficiently use LINQ queries, an experience is required in various areas such as the following:
LINQ is an innovative technology. It brings the power of SQL-like query syntax to application code.
Here is a first example
int[] numbers = { 8, 5, 2, 9, 3, 0, 7, 1, 6, 4 }; var q = from n in numbers where n < 5 orderby n select n; string S = ""; foreach (int num in q) S += num.ToString() + Environment.NewLine; MessageBox.Show(S);
The output of the above code is
0 1 2 3 4
A LINQ query starts with a from clause and ends with a select or group..by clause.
see also:
A LINQ query operation is performed in three steps.
Here is the above example again
// 1. define the datasource int[] numbers = { 8, 5, 2, 9, 3, 0, 7, 1, 6, 4 }; // 2. construct the query var q = from n in numbers where n < 5 orderby n select n; string S = ""; // 3. execute the query foreach (int num in q) S += num.ToString() + Environment.NewLine; MessageBox.Show(S);
Any type which implements the generic IEnumerable<T>, or a derived interface, is called queryable type and can be used as the datasource of a LINQ query operation.
public interface IEnumerable<T> : IEnumerable { IEnumerator<T> GetEnumerator(); }
Some types, such as types which implement the non-generic IEnumerable interface, can be cast to IEnumerable<T> and thus can be used as datasources with LINQ query operations. The ArrayList and Array classes are IEnumerable objects.
// 1. get the datasource int[] numbers = { 8, 5, 2, 9, 3, 0, 7, 1, 6, 4 };
The query is used to obtain, filter, sort, group and transform data. The query is actually an expression, called LINQ query expression, written in a SQL-like syntax.
The query expression, or the query variable, returns no data at all. It just represents the query commands. The actual query execution takes place at a later time.
The query expression may be stored to a local variable. That variable is of type IEnumerable<T>. It is easier and safer to use an implicitly typed local variable by using the keyword var and let compiler infer the actual data type.
// 2. construct the query var q = from n in numbers where n < 5 orderby n select n;
A LINQ query expression starts with the from clause. The from clause contains a reference to a datasource, which is a sequence, and a range variable which references an object of that sequence in any successive iteration.
A sequence is an object which implements the IEnumerable<T> generic interface, or it can be somehow presented as an IEnumerable<T>. In the above example the numbers array is a sequence.
The range variable of a query expression is declared in the from clause and it serves as a reference to each successive element of a sequence when the query is executed. Additional range variables may be declared in a let clause. In the above example the n variable is the range variable. The range variable can be implicitly typed for generic datasources and must be explicitly typed for non-generic datasources such as the ArrayList class.
var q = from int n in numbers where n < 5 orderby n select n;
The execution of a LINQ query expression is done by using a foreach statement. The foreach statement is where the query results are retrieved from the datasource. A foreach statement requires an IEnumerable or an IEnumerable<T> object.
// 3. execute the query foreach (int num in q) S += num.ToString() + Environment.NewLine;
Consider the following example
// 1. datasource List<Country> countries = new List<Country> { new Country {Code = "ES", Name = "Spain"}, new Country {Code = "CZ", Name = "Czech Republic"}, new Country {Code = "GR", Name = "Greece"}, new Country {Code = "DK", Name = "Denmark"}, new Country {Code = "CY", Name = "Cyprus"} }; // 2. query var q = from c in countries where c.Code.StartsWith("C") select c.Name; string S = ""; // 3. execution foreach (var countryName in q) S += countryName + Environment.NewLine; MessageBox.Show(S);
Any variable which participates in any of the three steps of a LINQ query operation is strongly-typed even if its type is not explicitly defined by the programmer. The compiler infers the actual type.
The following is equal to the previously used query expression.
IEnumerable<string> q = from c in countries where c.Code.StartsWith("C") select c.Name;
It is possible to even declare the type of the range variable, although in cases like the above is of a little value.
var q = from Country c in countries where c.Code.StartsWith("C") select c.Name;
The following, also, is equal to the above query execution.
foreach (string countryName in q) S += countryName + Environment.NewLine;
1. The element type of the datasource sequence determines the type of the range variable.
List<Country> countries = new List<Country> { new Country {Code = "ES", Name = "Spain"}, new Country {Code = "CZ", Name = "Czech Republic"}, new Country {Code = "GR", Name = "Greece"}, new Country {Code = "DK", Name = "Denmark"}, new Country {Code = "CY", Name = "Cyprus"} };
In the above example the element type is Country.
2. The type returned by the select clause determines the type of the whole query expression and thus the type of the query variable.
var q = from c in countries where c.Code.StartsWith("C") select c.Name;
In the above example the type returned by the select clause is string. So the type of the query variable and the type of the whole query expression is IEnumerable<string>, although it is declared as an implicitly typed variable.
3. The type of the query expression determines the type of the loop variable in the foreach statement.
foreach (var countryName in q) S += countryName + Environment.NewLine;
In the above example the type of the loop variable is string, although it is declared as an implicitly typed variable.
As a direct consequence of the above rules any type error is caught at compile-time.
A query expression is written using special contextual keywords introduced in C# 3.0. The contextual keywords used in a LINQ query are: from, join, where, group..by, orderby, select, into, in, on, let, equals, ascending and descending.
The from, join, where, group..by, orderby, select, into and let keywords introduce a clause in the expression. The from clause is the first clause in a LINQ query. The last clause should be a group..by or select clause.
Apparently LINQ queries use the same clauses as a typical SQL query syntax. Additionaly there is a let clause.
The following data are used in the discussion that follows
class Country { public string Code; public string Name; public List<string> Cities; } class Customer { public string Name; public string Country; public double Debit; } List<Country> countries = new List<Country> { new Country {Code = "ES", Name = "Spain", Cities = new List<string>{"Madrid", "Barcelona", "Valencia"} }, new Country {Code = "CZ", Name = "Czech Republic", Cities = new List<string>{"Prague", "Brno"} }, new Country {Code = "GR", Name = "Greece", Cities = new List<string>{"Athens", "Thessaloniki"} }, new Country {Code = "DK", Name = "Denmark", Cities = new List<string>{"Copenhagen", "Aarhus"} }, new Country {Code = "UK", Name = "United Kingdom", Cities = new List<string>{"London", "Edinburgh"} }, }; List<Customer> customers = new List<Customer> { new Customer {Name = "Cheap Software Ltd", Country = "GR", Debit = 35.6}, new Customer {Name = "Fast Internet Co", Country = "DK", Debit = 51.7}, new Customer {Name = "Neat Hardware", Country = "ES", Debit = 12.1}, new Customer {Name = "Cool Tech", Country = "DK", Debit = 23.5}, new Customer {Name = "Steady Logic", Country = "CZ", Debit = 49.4}, new Customer {Name = "Nasty Data", Country = "GR", Debit = 62.2} };
It is possible to have a compound from clause in order to access inner data.
var q = from country in countries from city in country.Cities where city.StartsWith("A") || city.StartsWith("B") orderby city descending select city; ... var q = from country in countries where (country.Code == "GR") || (country.Code == "DK") from customer in customers where (customer.Country == country.Code) && (customer.Debit > 40) orderby customer.Name select customer;
Subsequent from clauses resemble nested foreach statements
foreach (country in countries) { foreach (city in country.Cities) { ... } }
Queries containing compound from clauses could be written analytically.
var q = from country in countries where (country.Code == "GR") || (country.Code == "DK") select country.Code; var q2 = from customer in customers where q.Contains(customer.Country) && (customer.Debit > 40) select customer;
Joins in LINQ are always inner (equijoins) joins. A LINQ join joins two sequences in an equii-join based on some equality condition.
var q = from customer in customers join country in countries on customer.Country equals country.Code orderby country.Code select new { Name = customer.Name, Country = country.Name };
Since a join is always an inner join the order of the datasources in the expression has no impact at all. Thus the next example results in the same output as the above one.
var q = from country in countries join customer in customers on country.Code equals customer.Country orderby country.Code select new { Name = customer.Name, Country = country.Name };
The use of keyword into has a special effect when used with LINQ joins.
For each element of the left sequence it creates an IEnumerable<T> list, where T is the element type of the right sequence. That list contains those elements of the right sequence which pass the equality condition. This is grouping actually.
So joins using the keyword into are called group joins. There is no SQL equivalent for LINQ group join.
In general the keyword into is used to create an identifier (which is another implicitly typed local variable) which refers to the results of a group, join or select clause. That identifier can then be used further as a datasource for additional query commands.
var q = from country in countries join customer in customers on country.Code equals customer.Country into customersPerCountry orderby country.Code select new { Country = country.Name, Customers = customersPerCountry };
The left join simulation is done by using a second iteration (from clause) over the sequence produced by the group join and referenced by the keyword into, and forcing ALL the elements of that produced sequence into the final sequence, by using the DefaultIfEmpty() extension method .
var q = from country in countries join customer in customers on country.Code equals customer.Country into customersPerCountry orderby country.Code from v in customersPerCountry.DefaultIfEmpty() select new { Country = country.Name, Customer = (v == null ? string.Empty : v.Name) };
The result of a group..by clause is actually a
IEnumerable<IGrouping<TKey, TElement>>
that is a sequence of IGrouping<TKey, TElement> objects. Each IGrouping<TKey, TElement> object may contain zero or more elements that match the key condition for the group.
/* here, the implicitly typed variable q would be defined as IEnumerable<IGrouping<string, Customer>> q = ... */ var q = from customer in customers orderby customer.Country group customer by customer.Country;
Here is another group..by example using an into clause. The Key is a property of the IGrouping<TKey, TElement> interface.
var q = from customer in customers group customer by customer.Name[0] into g orderby g.Key select g;
A very frequent need is to group a datasource and then apply some aggregate function. One way to achieve this is to group data using an into clause and then call the appropriate aggregate extension method, possibly using a lambda expression.
var q = from customer in customers group customer by customer.Country into cc orderby cc.Key select new { Country = cc.Key, Debit = cc.Sum(c => c.Debit) };
The let keyword can be used inside a query expression in order to create and initialize additional range variables. Those range variables are considered as constants inside the expression. It may be used in further queries though, if they store a queryable type.
var q = from country in countries let DebitList = ( from customer in customers where (customer.Country == country.Code) select customer.Debit ) orderby country.Code select new { Country = country.Code, Debit = DebitList.Sum() };
In the above example the IEnumerable<T>.Sum() method is used through a range variable created with the let keyword.
see also:
A LINQ query expression is written using special contextual keywords, such as from, select, orderby etc. At compile-time the compiler converts those keywords to special method calls known as Standard Query Operators.
Standard query operators are methods defined as extension methods in special static classes and have names such as Select(), Where(), OrderBy() etc.
The Enumerable class is the container of those extension methods related to IEnumerable<T>. The Queryable class is the container of those extension methods related to IQueryable<T>.
Most of those standard query operators operate on sequences, that is objects implementing the IEnumerable<T> generic interface or a derived interface such as the IQueryable<T>, and provide capabilities similar to a SQL SELECT statement, that is filtering, sorting, grouping, ordering, aggregation and the like.
As a consequence a query expression can be expressed using standard query operators with exactly the same results.
int[] numbers = { 8, 5, 2, 9, 3, 0, 7, 1, 6, 4 }; var q = from n in numbers where n < 5 orderby n select n; ... var q = numbers.Where(n => n < 5) .OrderBy(n => n);
Query Syntax is the syntax that uses contextual keywords while Method Syntax is the syntax that uses standard query operators. Although query syntax is the recommended way there is no semantic difference between the two. Furthermore query syntax has its limits. For example there is no contextual keyword for aggregate functions such as Sum, Min or Max.
A query written in query syntax returns no data at all and remains un-executed. The query is executed and data is retrieved from the datasource when a variable of that query is used with a foreach statement. This is called Deferred Execution.
Immediate Execution happens when the query is executed by means other than a foreach statement. Some standard query operators such as the Count(), Min(), Max(), Average(), ToList(), ToArray() etc force an immediate execution.
var q = numbers.Where(n => n < 5) .OrderBy(n => n) .ToArray();
The above can be written in two parts as following.
var q = numbers.Where(n => n < 5) .OrderBy(n => n); var q2 = q.ToArray();
Calling ToList() and ToArray() creates a strongly-type IEnumerable<T> collection object which then contains the data obtained from the datasource.
A number of LINQ operations have no equivalent query syntax. Those methods are callable from inside a query expression though as any other available method.
Min() returns the minimum value in the sequence. Max() returns the maximum value in the sequence. Sum() returns the sum of the values in the sequence. Count() returns the number of elements in the sequence. LongCount() returns the number of elements in the sequence. Average() returns the average value in the sequence. Aggregate() performs a custom aggregation operation in the sequence.
AsEnumerable() returns the sequence as IEnumerable<T>. AsQueryable() returns the sequence as an IQueryable<T>. Cast() casts the elements of the sequence to a specified type. OfType() returns a new sequence containing only those elements that can be cast to the specified type. ToList() returns a List<T> containing all the elements of the sequence. ToArray() returns a Array<T> containing all the elements of the sequence. ToLookup() returns a Lookup<TKey, TElement>, an one-to-many dictionary (key-to-elements), based on a key condition. ToDictionary() returns a Dictionary<TKey, TValue>, an one-to-one dictionary (key-to-element), based on a key condition.
Distinct() returns a new sequence removing duplicate values from the sequence. Except() returns a new sequence containing the elements of the sequence that do not appear in a specified sequence (set difference). Intersect() returns a new sequence containing only the elements common to the sequence and to a specified sequence (set intersection). Union() returns a new sequence containing elements of the sequence and a specified sequence, excluding duplicates.
All() returns true if all the elements in the sequence satisfy a specified a condition. Any() returns true if any of the elements in the sequence satisfy a specified a condition. Contains() returns true if the sequence contains a specified element.
DefaultIfEmpty() returns a new sequence with a single element of the specified type and with a default value if the sequence is empty. Empty() returns an new empty collection. Range() returns a new sequence that contains a sequence of numbers. Repeat() returns a new sequence that contains one repeated value.
First() returns the first element in the sequence or the first element that satisfies a specified condition. Last() returns the last element in the sequence or the last element that satisfies a specified condition. ElementAt() returns the element at a specified index in the sequence. Single() returns the only element in the sequence or the only element that satisfies a specified condition. FirstOrDefault() as the First(). Alternatively it returns a default value if no such element exists. LastOrDefault() as the Last(). Alternatively it returns a default value if no such element exists. ElementAtOrDefault() as the ElementAt(). Alternatively it returns a default value if no such element exists. SingleOrDefault() as the Single(). Alternatively it returns a default value if no such element exists.
Skip() returns a new sequence containing elements from the specified index until the end of the sequence. Take() returns a new sequence containing elements from index zero to the specified index in the sequence. SkipWhile() skips elements in the sequence while the specified condition is true and returns the remaining elements TakeWhile() takes elementes in the sequence while the specified condition is true and ignores the rest.
Reverse() returns a new sequence containing the elements of the sequence in reverse order. SequenceEqual() returns true if this sequence equals to a specified sequence by comparing elements in a pair-wise manner. Concat() returns a new sequence by concatenating the sequence and a specified sequence.
Here is an example using the Lookup() method.
// the key is a bool value, odd and even numbers. // The Lookup() method creates a sub-list for each key ILookup<bool, int> lk = numbers.ToLookup(n => n % 2 == 0); // get the sub-list for odd numbers IEnumerable<int> list = lk[false]; string S = "odd numbers" + Environment.NewLine; foreach (int n in list) S += " " + n.ToString() + Environment.NewLine; // get the sub-list for even numbers list = lk[true]; S += "even numbers" + Environment.NewLine; foreach (int n in list) S += " " + n.ToString() + Environment.NewLine; MessageBox.Show(S);
Here is another example using the Intersect() and Last() methods.
int[] a = { 2, 9, 6, 4, 7}; int[] b = { 7, 1, 8, 5, 9 }; var v = a.Intersect(b); string S = ""; foreach (int n in v) { S += n.ToString(); if (n != v.Last()) S += ", "; } MessageBox.Show(S);
LINQ can be extended to use a number of different datasources, such as a CLR data structure (CLR type), an SQL database of some kind, a Web or WCF service, the the file system, etc.
To use LINQ against a specific type of datasource, an implementation of Standard Query Operators is required for that datasource.
CLR already provides an implementation of Standard Query Operators for any CLR data structure (CLR type), that is for in-memory-data. The System.Linq.Enumerable static class contains a full set of Standard Query Operators, already implemented as extension methods, for querying objects that implement IEnumerable<T>.
The only additional requirements for a type to make it LINQ-ready is to implement the IEnumerable<T> generic interface.
Here is an example.
public class CustomList: IEnumerable<int> { private List<int> list = new List<int> { 0, 8, 5, 2, 9, 3, 7, 1, 6, 4 }; public IEnumerator<int> GetEnumerator() { for (int i = 0; i < list.Count; i++) yield return list[i]; } System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() { for (int i = 0; i < list.Count; i++) yield return list[i]; } } ... CustomList list = new CustomList(); var q = from n in list where n >= 5 orderby n select n; string S = ""; foreach (int n in q) S += n.ToString() + Environment.NewLine; MessageBox.Show(S);
If the predefined functionality provided by the Enumerable class extension methods is not the desirable functionality, then the programmer has to implenent a full custom set of Standard Query Operators either as instance methods to the type or as extension methods for that type.
To make a remote datasource, such as an SQL database of some kind or a Web-WCF service, LINQ-ready, the implementation of a LINQ Provider is required.
A LINQ provider is a library or API which provides an implementation of the Standard Query Operators for a specific datasource. The actual translation of query expressions to standard query operators is done by a LINQ provider.
There is already a predefined set of LINQ providers in BCL
A LINQ Provider for a remote datasource has to implement the IQueryable<T> generic interface, at least, and possibly the IOrderedQueryable<T> and IQueryProvider interfaces. A provider which implements the IQueryable<T> receives LINQ queries in the form of Expression Trees which it then executes in a way best suited to the specific datasource. A LINQ provider can also define operations for inserting, updating and deleting data from/to the datasource it represents.
Expreesion Trees are described elsewhere.
see also:
LINQ to objects means the use of LINQ queries with any datasource which is an IEnumerable or an IEnumerable<T> object, directly and without any specialized LINQ provider intervention. LINQ to objects is used to query datasources like an array, an ArrayList, a List<T>, a Dictionary<TKey, TValue> and the like.
All examples so far were LINQ to Objects applications.
Here is another LINQ to Objects example, which queries the file system.
/* returns a list of FileInfo objects for files found in path */ List<FileInfo> GetFileListOf(string path) { List<FileInfo> Result = new List<FileInfo>(); string[] files = Directory.GetFiles(path, "*.*", SearchOption.AllDirectories); foreach (string s in files) Result.Add(new FileInfo(s)); return Result; } ... /* groups and presents file names by extension */ string path = Path.GetFullPath(@"..\..\"); List<FileInfo> fileList = GetFileListOf(path); var q = from FI in fileList group FI.Name by FI.Extension.ToUpper() into g orderby g.Key select g; string S = ""; foreach (var g in q) { S += g.Key + Environment.NewLine; foreach (string fileName in g) S += " " + fileName + Environment.NewLine; } MessageBox.Show(S);
LINQ to DataSet provides the means for querying DataTable objects. LINQ to DataSet functionality is provided by the DataTableExtensions and DataRowExtensions classes which contain a set of extension methods relative to DataTable and DataRow classes.
A DataTable can be converted to an IEnumerable<DataRow> sequence with a call to AsEnumerable() method
public static EnumerableRowCollection<DataRow> AsEnumerable(this DataTable source);
and an IEnumerable<DataRow> sequence can be converted back to a DataTable with a call to CopyToDataTable() method
public static DataTable CopyToDataTable<T>(this IEnumerable<T> source) where T : DataRow;
A column value is accessed by using the overloaded Field() generic method provided by the DataRowExtensions class
public static T Field<T>(this DataRow row, string columnName); public static T Field<T>(this DataRow row, int columnIndex);
There is a problem with that implementation of the CopyToDataTable() though, since it defines a constraint, which means that only IEnumerable<DataRow> sequences may be passed to the method. But frequently when processing a join for example, the easiest way is to create sequences of anonymous type objects. Unfortunatelly the CopyToDataTable() restricts the element type to DataRow.
There is a solution though found at internet. Here it is, slightly modified by the author of this tutorial.
/* taken from: http://msdn.microsoft.com/en-us/library/bb669096.aspx and http://blogs.msdn.com/aconrad/archive/2007/09/07/science-project.aspx This class provides the code needed to implement overloads of the CopyToDataTable() extension method for the IEnumerable<T> without a generic constraint as the other versions, those contained in the DataTableExtensions class, do. So these new overloads of the CopyToDataTable() can be used with any object returned by a query expression, even anonymous types. */ public class ObjectShredder<T> { private System.Reflection.FieldInfo[] _fi; private System.Reflection.PropertyInfo[] _pi; private System.Collections.Generic.Dictionary<string, int> _ordinalMap; private System.Type _type; // ObjectShredder constructor. public ObjectShredder() { _type = typeof(T); _fi = _type.GetFields(); _pi = _type.GetProperties(); _ordinalMap = new Dictionary<string, int>(); } /// <summary> /// Loads a DataTable from a sequence of objects. /// </summary> /// <param name="source">The sequence of objects to load into the DataTable.</param> /// <param name="table">The input table. The schema of the table must match that /// the type T. If the table is null, a new table is created with a schema /// created from the public properties and fields of the type T.</param> /// <param name="options">Specifies how values from the source sequence will be applied to /// existing rows in the table.</param> /// <returns>A DataTable created from the source sequence.</returns> public DataTable Shred(IEnumerable<T> source, DataTable table, LoadOption? options) { // Load the table from the scalar sequence if T is a primitive type. if (typeof(T).IsPrimitive) { return ShredPrimitive(source, table, options); } // Create a new table if the input table is null. if (table == null) { table = new DataTable(typeof(T).Name); } // Initialize the ordinal map and extend the table schema based on type T. table = ExtendTable(table, typeof(T)); // Enumerate the source sequence and load the object values into rows. table.BeginLoadData(); using (IEnumerator<T> e = source.GetEnumerator()) { while (e.MoveNext()) { if (options != null) { table.LoadDataRow(ShredObject(table, e.Current), (LoadOption)options); } else { table.LoadDataRow(ShredObject(table, e.Current), true); } } } table.EndLoadData(); // Return the table. return table; } public DataTable ShredPrimitive(IEnumerable<T> source, DataTable table, LoadOption? options) { // Create a new table if the input table is null. if (table == null) { table = new DataTable(typeof(T).Name); } if (!table.Columns.Contains("Value")) { table.Columns.Add("Value", typeof(T)); } // Enumerate the source sequence and load the scalar values into rows. table.BeginLoadData(); using (IEnumerator<T> e = source.GetEnumerator()) { Object[] values = new object[table.Columns.Count]; while (e.MoveNext()) { values[table.Columns["Value"].Ordinal] = e.Current; if (options != null) { table.LoadDataRow(values, (LoadOption)options); } else { table.LoadDataRow(values, true); } } } table.EndLoadData(); // Return the table. return table; } public object[] ShredObject(DataTable table, T instance) { FieldInfo[] fi = _fi; PropertyInfo[] pi = _pi; if (instance.GetType() != typeof(T)) { // If the instance is derived from T, extend the table schema // and get the properties and fields. ExtendTable(table, instance.GetType()); fi = instance.GetType().GetFields(); pi = instance.GetType().GetProperties(); } // Add the property and field values of the instance to an array. Object[] values = new object[table.Columns.Count]; foreach (FieldInfo f in fi) { values[_ordinalMap[f.Name]] = f.GetValue(instance); } foreach (PropertyInfo p in pi) { values[_ordinalMap[p.Name]] = p.GetValue(instance, null); } // Return the property and field values of the instance. return values; } public DataTable ExtendTable(DataTable table, Type type) { // Extend the table schema if the input table was null or if the value // in the sequence is derived from type T. foreach (FieldInfo f in type.GetFields()) { if (!_ordinalMap.ContainsKey(f.Name)) { // Add the field as a column in the table if it doesn't exist // already. DataColumn dc = table.Columns.Contains(f.Name) ? table.Columns[f.Name] : table.Columns.Add(f.Name, f.FieldType); // Add the field to the ordinal map. _ordinalMap.Add(f.Name, dc.Ordinal); } } foreach (PropertyInfo p in type.GetProperties()) { if (!_ordinalMap.ContainsKey(p.Name)) { // Add the property as a column in the table if it doesn't exist // already. DataColumn dc = table.Columns.Contains(p.Name) ? table.Columns[p.Name] : table.Columns.Add(p.Name, p.PropertyType); // Add the property to the ordinal map. _ordinalMap.Add(p.Name, dc.Ordinal); } } // Return the table. return table; } } /* serves just as a container for the CopyToDataTable() extension methods */ public static class CustomLINQtoDataSetMethods { public static DataTable CopyToDataTable<T>(this IEnumerable<T> source) { if (source is IEnumerable<DataRow>) return DataTableExtensions.CopyToDataTable<DataRow>(source as IEnumerable<DataRow>); return new ObjectShredder<T>().Shred(source, null, null); } public static DataTable CopyToDataTable<T>(this IEnumerable<T> source, DataTable table, LoadOption options) { if (source is IEnumerable<DataRow>) { DataTableExtensions.CopyToDataTable<DataRow>(source as IEnumerable<DataRow>, table, options); return table; } return new ObjectShredder<T>().Shred(source, table, options); } }
All LINQ to DataSet examples use a MS Access database with the following schema and data.
create table CITY ( Country varchar(4), Name varchar(32) ); create table COUNTRY ( Code varchar(4), Name varchar(32) ); create table CUSTOMER ( Name varchar(32), Country varchar(4), Debit float ); insert into CITY values ('ES', 'Madrid'); insert into CITY values ('ES', 'Barcelona'); insert into CITY values ('ES', 'Valencia'); insert into CITY values ('CZ', 'Prague'); insert into CITY values ('CZ', 'Brno'); insert into CITY values ('GR', 'Athens'); insert into CITY values ('GR', 'Thessaloniki'); insert into CITY values ('DK', 'Copenhagen'); insert into CITY values ('DK', 'Aarhus'); insert into CITY values ('UK', 'London'); insert into CITY values ('UK', 'Edinburgh'); insert into COUNTRY values ('ES', 'Spain'); insert into COUNTRY values ('CZ', 'Czech Republic'); insert into COUNTRY values ('GR', 'Greece'); insert into COUNTRY values ('DK', 'Denmark'); insert into COUNTRY values ('UK', 'United Kingdom'); insert into CUSTOMER values ('Cheap Software Ltd', 'GR', 35.6); insert into CUSTOMER values ('Fast Internet Co', 'DK', 51.7); insert into CUSTOMER values ('Neat Hardware', 'ES', 12.1); insert into CUSTOMER values ('Cool Tech', 'DK', 23.5); insert into CUSTOMER values ('Steady Logic', 'CZ', 49.4); insert into CUSTOMER values ('Nasty Data', 'GR', 62.2);
Data are loaded when the sample application starts and they are available as DataTable objects.
public partial class MainForm : Form { ... ... /* static fields */ static private readonly string connectionString = @""; static private DbProviderFactory factory = null; /* static constructor */ static MainForm() { string S = Path.GetFullPath(@"..\..\..\LINQ.MDB"); connectionString = string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=""{0}"";User Id=admin;Password=;", S); factory = DbProviderFactories.GetFactory("System.Data.OleDb"); } /* a static method which executes SQL and returns a DataTable object */ static DataTable Select(string SQL) { DataTable Result = null; if (factory != null) { DbConnection con = factory.CreateConnection(); con.ConnectionString = connectionString; try { con.Open(); using (DbCommand cmd = con.CreateCommand()) { using (DbDataAdapter adapter = factory.CreateDataAdapter()) { cmd.CommandText = SQL; adapter.SelectCommand = cmd; Result = new DataTable(); adapter.Fill(Result); } } } finally { if (con.State == ConnectionState.Open) con.Close(); } } return Result; } /* instance fields */ DataTable tblCity = Select("select * from CITY"); DataTable tblCountry = Select("select * from COUNTRY"); DataTable tblCustomer = Select("select * from CUSTOMER"); ... ... }
The following examples are based on the examples presented earlier for the LINQ to Objects and properly modified.
The query variable is of type IEnumerable<DataRow>. The AsEnumerable() is used to convert a DataTable to an IEnumerable<DataRow> object. Thus the range variable, city in this example, is a DataRow object. Finally the CopyToDataTable() is used to convert the IEnumerable<DataRow> back to a DataTable which then can be used as a DataSource for a DataGridView.
var q = from city in tblCity.AsEnumerable() where city.Field<string>("Name").StartsWith("A") || city.Field<string>("Name").StartsWith("B") orderby city.Field<string>("Name") descending select city; Grid.DataSource = q.CopyToDataTable();
A sub-query based on a result list coming from the first query. Each from clause can be thought of as a separated foreach statement.
var q = from country in tblCountry.AsEnumerable() where (country.Field<string>("Code") == "GR") || (country.Field<string>("Code") == "DK") from customer in tblCustomer.AsEnumerable() where (customer.Field<string>("Country") == country.Field<string>("Code")) && (customer.Field<double>("Debit") > 40) orderby customer.Field<string>("Name") select customer; Grid.DataSource = q.CopyToDataTable();
Queries containing compound from clauses could be written analytically.
var q = from country in tblCountry.AsEnumerable() where (country.Field<string>("Code") == "GR") || (country.Field<string>("Code") == "DK") select country.Field<string>("Code"); var q2 = from customer in tblCustomer.AsEnumerable() where q.Contains(customer.Field<string>("Country")) && (customer.Field<double>("Debit") > 40) orderby customer.Field<string>("Name") select customer; Grid.DataSource = q2.CopyToDataTable();
Joins in LINQ are always inner (equijoins) joins. A LINQ join joins two sequences in an equii-join based on some equality condition.
The DataTableExtensions.CopyToDataTable<T>() extension method is a generic with a constraint for DataRow class. Meaning that it can be used with DataRow objects only. However in the next example the element type is an anonymous type. For this and similar examples to work a solution is provided taken from
http://msdn.microsoft.com/en-us/library/bb669096.aspx
which adds another CopyToDataTable() overloaded version without any type constraint.
var q = from customer in tblCustomer.AsEnumerable() join country in tblCountry.AsEnumerable() on customer.Field<string>("Country") equals country.Field<string>("Code") orderby country.Field<string>("Code") select new { Name = customer.Field<string>("Name"), Country = country.Field<string>("Name") }; Grid.DataSource = q.CopyToDataTable();
Since a join is always an inner join the order of the datasources in the expression has no impact at all. Thus the next example results in the same output as the above one.
var q = from country in tblCountry.AsEnumerable() join customer in tblCustomer.AsEnumerable() on country.Field<string>("Code") equals customer.Field<string>("Country") orderby country.Field<string>("Code") select new { Name = customer.Field<string>("Name"), Country = country.Field<string>("Name") }; Grid.DataSource = q.CopyToDataTable();
This and the next examples which group data, create a DataSet which holds a DataTable for each group. Here are the helper methods
/* helper method: creates and returns a DataTable based on TableName and Rows */ DataTable ConstructTable(string TableName, IEnumerable<DataRow> Rows) { DataTable Result; if (Rows.Count() > 0) Result = Rows.CopyToDataTable(); else Result = new DataTable(); Result.TableName = TableName; return Result; } /* helper method: returns a DataSet after placing all tables into its Tables and TableNames into cboTables.Items */ DataSet ConstructDataset(IEnumerable<DataTable> tables) { DataSet Result = new DataSet(); cboTables.Items.Clear(); foreach (DataTable table in tables) { Result.Tables.Add(table); cboTables.Items.Add(table.TableName); } cboTables.SelectedIndex = 0; return Result; }
The next example joins the two tables into customersPerCountry, an IEnumerable<DataRow> object, and then it calls the helper ConstructTable() method in order to create a DataTable for the current group. Here the q implicitly typed local variable is a IEnumerable<DataTable> object.
var q = from country in tblCountry.AsEnumerable() join customer in tblCustomer.AsEnumerable() on country.Field<string>("Code") equals customer.Field<string>("Country") into customersPerCountry orderby country.Field<string>("Code") select ConstructTable(country.Field<string>("Name"), customersPerCountry); Grid.DataSource = ConstructDataset(q).Tables[0];
As with the LINQ to Objects, the left join simulation is done by using a second iteration (from clause) over the sequence produced by the group join and referenced by the keyword into, and forcing ALL the elements of that produced sequence into the final sequence, by using the DefaultIfEmpty() extension method .
var q = from country in tblCountry.AsEnumerable() join customer in tblCustomer.AsEnumerable() on country.Field<string>("Code") equals customer.Field<string>("Country") into customersPerCountry orderby country.Field<string>("Code") from v in customersPerCountry.DefaultIfEmpty() select new { Country = country.Field<string>("Name"), Customer = (v == null ? string.Empty : (string)v["Name"]) }; Grid.DataSource = q.CopyToDataTable();
This example is slightly different from the corresponding one in the LINQ to Objects examples. An into clause is added and the grouping result is passed to the ConstructTable().
The q implicitly typed local variable is a IEnumerable<DataTable> object, while the g is a IGrouping<string, DataRow>.
/* here, the implicitly typed variable q would be defined as IEnumerable<DataTable> q = ... */ var q = from customer in tblCustomer.AsEnumerable() orderby customer.Field<string>("Country") group customer by customer.Field<string>("Country") into g select ConstructTable(g.Key, g.ToList()); Grid.DataSource = ConstructDataset(q).Tables[0];
Another group..by example using an into clause.
var q = from customer in tblCustomer.AsEnumerable() group customer by customer.Field<string>("Name")[0] into g orderby g.Key select ConstructTable(g.Key.ToString(), g.ToList()); Grid.DataSource = ConstructDataset(q).Tables[0];
Grouping and using aggregate functions (sum).
var q = from customer in tblCustomer.AsEnumerable() group customer by customer.Field<string>("Country") into cc orderby cc.Key select new { Country = cc.Key, Debit = cc.Sum(c => c.Field<double>("Debit")) }; Grid.DataSource = q.CopyToDataTable();
Almost the same as with the LINQ to Objects.
var q = from country in tblCountry.AsEnumerable() let DebitList = ( from customer in tblCustomer.AsEnumerable() where (customer.Field<string>("Country") == country.Field<string>("Code")) select customer.Field<double>("Debit")) orderby country.Field<string>("Code") select new { Country = country.Field<string>("Code"), Debit = DebitList.Sum() }; Grid.DataSource = q.CopyToDataTable();