c# – Batch data processing : Visual Studio

I am working in C#, reading a huge table from a database and loading it into a DataTable.

Because the table contains a very large number of rows (1,800,000) I keep getting out-of-memory errors, so I am trying to break the work up: copy 100,000 rows at a time, free the memory, and repeat until all of the data from the source table has been loaded into my DataTable.

Could you look at my code and tell me whether I am on the right track? From what I can see, I keep reading the first 100,000 rows over and over again, and my program runs indefinitely.

Do I need to add a counter to my DataTable so that it picks up the next set of rows???
My code snippet is below:

    public IoSqlReply GetResultSet(String directoryName, String userId, String password, String sql)
    {
        IoSqlReply ioSqlReply = new IoSqlReply();
        DataTable dtResultSet = new DataTable();
        int chunkSize = 100000;     // intended batch size -- currently unused
        try
        {
            using (OdbcConnection conn = new OdbcConnection(cs))
            {
                conn.Open();

                using (OdbcCommand cmd = new OdbcCommand(sql, conn))
                using (OdbcDataReader reader = cmd.ExecuteReader())
                {
                    // build the DataTable schema from the reader's columns
                    for (int col = 0; col < reader.FieldCount; col++)
                    {
                        dtResultSet.Columns.Add(reader.GetName(col), reader.GetFieldType(col));
                    }

                    // now copy each row/column to the datatable
                    while (reader.Read())       // loop round all rows in the source table
                    {
                        DataRow row = dtResultSet.NewRow();

                        for (int ixCol = 0; ixCol < reader.FieldCount; ixCol++)     // loop round all columns in each row
                        {
                            row[ixCol] = reader.GetValue(ixCol);
                        }

                        // finished processing the row, add it to the datatable
                        dtResultSet.Rows.Add(row);

                        GC.Collect();       // free up memory
                    }

                    ioSqlReply.DtResultSet = dtResultSet;       // return the data table
                    ioSqlReply.RowCount = dtResultSet.Rows.Count;
                    Console.WriteLine("DTRESULTSET:ROW COUNT FINAL : " + dtResultSet.Rows.Count);
                    ioSqlReply.Rc = 0;
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
        return ioSqlReply;
    }

Best answer: You should limit the number of rows in your SQL, for example…

SELECT TOP 10000 * FROM SomeTable;

If you don't, and your query pulls back all 1.8M rows at once, no reasonable amount of memory will hold them in a single DataTable.

But that would make your application process only the first 10,000 rows… If you need to process all the rows, you should execute that SQL iteratively, paging through the table until no more rows come back… for example:

    public IoSqlReply GetResultSet(String directoryName, String userId, String password, String sql)
    {
        IoSqlReply ioSqlReply = new IoSqlReply();
        DataTable dtResultSet = new DataTable();
        bool keepProcessing = true;
        int chunkSize = 100000;
        int offset = 0;

        try
        {
            using (OdbcConnection conn = new OdbcConnection(cs))
            {
                conn.Open();

                while (keepProcessing)
                {
                    // page through the source table; assumes `sql` contains an
                    // ORDER BY and the server supports OFFSET/FETCH (SQL Server 2012+)
                    String pagedSql = sql + " OFFSET " + offset + " ROWS FETCH NEXT " + chunkSize + " ROWS ONLY";

                    using (OdbcCommand cmd = new OdbcCommand(pagedSql, conn))
                    using (OdbcDataReader reader = cmd.ExecuteReader())
                    {
                        // build the schema once, on the first page only --
                        // re-adding the columns on every pass would throw
                        if (dtResultSet.Columns.Count == 0)
                        {
                            for (int col = 0; col < reader.FieldCount; col++)
                            {
                                dtResultSet.Columns.Add(reader.GetName(col), reader.GetFieldType(col));
                            }
                        }

                        int rowsThisPage = 0;

                        while (reader.Read())       // loop round all rows in this page
                        {
                            DataRow row = dtResultSet.NewRow();

                            for (int ixCol = 0; ixCol < reader.FieldCount; ixCol++)     // loop round all columns in each row
                            {
                                row[ixCol] = reader.GetValue(ixCol);
                            }

                            // finished processing the row, add it to the datatable
                            dtResultSet.Rows.Add(row);
                            rowsThisPage++;
                        }

                        if (rowsThisPage == 0)
                        {
                            keepProcessing = false;     // no more rows -- we are done
                        }
                        else
                        {
                            offset += rowsThisPage;     // advance to the next page
                        }
                    }
                }
            }

            ioSqlReply.DtResultSet = dtResultSet;       // return the data table
            ioSqlReply.RowCount = dtResultSet.Rows.Count;
            Console.WriteLine("DTRESULTSET:ROW COUNT FINAL : " + dtResultSet.Rows.Count);
            ioSqlReply.Rc = 0;
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
        return ioSqlReply;
    }
This is a very rough example… it can certainly be improved, but I think it is enough to get you past your problem.
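One caveat: even with paging, the approach above still accumulates all 1.8M rows in a single DataTable, so the out-of-memory error can simply come back at the end. If the rows can be handled a batch at a time, a sketch along these lines keeps at most one chunk in memory. The `processBatch` callback is a hypothetical stand-in for whatever per-batch work you need, and the OFFSET/FETCH paging again assumes the query has an ORDER BY and the server supports that syntax:

```csharp
using System;
using System.Data;
using System.Data.Odbc;

// Sketch: stream the table in fixed-size pages and hand each page to a
// caller-supplied callback, so at most chunkSize rows are held at once.
public static void ProcessInChunks(string cs, string sql, int chunkSize, Action<DataTable> processBatch)
{
    int offset = 0;
    using (OdbcConnection conn = new OdbcConnection(cs))
    {
        conn.Open();
        while (true)
        {
            string pagedSql = sql + " OFFSET " + offset + " ROWS FETCH NEXT " + chunkSize + " ROWS ONLY";
            DataTable page = new DataTable();
            using (OdbcCommand cmd = new OdbcCommand(pagedSql, conn))
            using (OdbcDataAdapter da = new OdbcDataAdapter(cmd))
            {
                da.Fill(page);          // load just this page into memory
            }
            if (page.Rows.Count == 0)
                break;                  // no more rows -- done
            processBatch(page);         // caller handles the batch...
            offset += page.Rows.Count;  // ...then it goes out of scope
        }
    }
}
```

Each page is eligible for collection as soon as the callback returns, so memory use stays bounded by the chunk size rather than the table size.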
