STK++ 0.9.13
STK::DataHandler Class Reference

implementation of the DataHandlerBase class using ReadWriteCsv and Array2D.

#include <STK_DataHandler.h>

Inheritance diagram for STK::DataHandler: [inheritance graph]

Public Types

typedef DataHandlerBase< DataHandler > Base
 
typedef DataHandlerBase< DataHandler >::InfoMap InfoMap
 
- Public Types inherited from STK::DataHandlerBase< DataHandler >
typedef std::map< std::string, std::string > InfoMap
 

Public Member Functions

 DataHandler ()
 default constructor
 
 ~DataHandler ()
 destructor
 
ReadWriteCsv const & data () const
 get the whole data set
 
ReadWriteCsv const & descriptor () const
 get the whole descriptor set
 
int nbSample () const
 
int nbVariable () const
 
void setWithNames (bool withNames)
 set withNames flag
 
bool readDataFromCsvFile (std::string const &datafile, std::string descriptorfile)
 read a data file and its companion description file.
 
bool readDataFromCsvFile (std::string const &datafile, std::string const &idData, std::string const &idModel)
 read a data file.
 
template<typename Type >
bool readDataFromArray2D (Array2D< Type > const &data, std::string const &idData, std::string const &idModel)
 read a data set from an Array2D.
 
template<typename Array >
bool readDataFromArray (ExprBase< Array > const &data, std::string const &idData, std::string const &idModel)
 read a data set from an Array or Expression.
 
template<typename Type >
void getData (std::string const &idData, Array2D< Type > &data) const
 
void removeData (std::string const &idData)
 remove the data with the given idData
 
- Public Member Functions inherited from STK::DataHandlerBase< DataHandler >
 ~DataHandlerBase ()
 destructor
 
InfoMap const & info () const
 
bool addInfo (std::string const &idData, std::string const &idModel)
 Add an info descriptor to the data handler.
 
bool getIdModelName (std::string const &idData, std::string &idModel) const
 Giving the Id of a data set, find the Id of the model.
 
void writeInfo (ostream &os) const
 write infoMap on os
 
- Public Member Functions inherited from STK::IRecursiveTemplate< Derived >
Derived & asDerived ()
 static cast : return a reference of this with a cast to the derived class.
 
Derived const & asDerived () const
 static cast : return a const reference of this with a cast to the derived class.
 
Derived * asPtrDerived ()
 static cast : return a ptr on a Derived of this with a cast to the derived class.
 
Derived const * asPtrDerived () const
 static cast : return a ptr on a constant Derived of this with a cast to the derived class.
 
Derived * clone () const
 create a leaf using the copy constructor of the Derived class.
 
Derived * clone (bool isRef) const
 create a leaf using the copy constructor of the Derived class and a flag determining if the clone is a reference or not.
 

Protected Member Functions

std::vector< int > colIndex (std::string const &idData) const
 lookup on the descriptors in order to get the columns of the ReadWriteCsv with the Id idData.
 
- Protected Member Functions inherited from STK::DataHandlerBase< DataHandler >
 DataHandlerBase ()
 default constructor
 
- Protected Member Functions inherited from STK::IRecursiveTemplate< Derived >
 IRecursiveTemplate ()
 constructor.
 
 ~IRecursiveTemplate ()
 destructor.
 

Private Attributes

bool withNames_
 first line with names ?
 
ReadWriteCsv data_
 data files
 
ReadWriteCsv descriptor_
 descriptor files with two rows.
 

Additional Inherited Members

- Protected Attributes inherited from STK::DataHandlerBase< DataHandler >
InfoMap info_
 Store the information of the mixtures in the form (idData, idModel) with.
 

Detailed Description

implementation of the DataHandlerBase class using ReadWriteCsv and Array2D.

The DataHandler class reads CSV files together with their description files and returns the columns identified by an idData as an Array2D. All data are stored in memory in a ReadWriteCsv structure.

Definition at line 69 of file STK_DataHandler.h.

Member Typedef Documentation

◆ Base

◆ InfoMap

Constructor & Destructor Documentation

◆ DataHandler()

STK::DataHandler::DataHandler ( )
inline

default constructor

Definition at line 75 of file STK_DataHandler.h.

  : Base(), withNames_(false)
  { data_.setWithNames(false); descriptor_.setWithNames(false);}

References data_, and descriptor_.

◆ ~DataHandler()

STK::DataHandler::~DataHandler ( )
inline

destructor

Definition at line 78 of file STK_DataHandler.h.

  {}

Member Function Documentation

◆ colIndex()

std::vector< int > STK::DataHandler::colIndex ( std::string const & idData) const
protected

lookup on the descriptors in order to get the columns of the ReadWriteCsv with the Id idData.

Parameters
idData    id of the data to get

Definition at line 138 of file STK_DataHandler.cpp.

{
  int rowIdData = descriptor_.beginRows()+1;
  std::vector<int> colindex;
  for (int i = descriptor_.beginCols(); i <= descriptor_.lastIdxCols(); ++i)
  { if (descriptor_.var(i).at(rowIdData) == idData) colindex.push_back(i);}
  return colindex;
}

References descriptor_.

Referenced by getData().
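The lookup above can be illustrated without STK++ at all. The following is only a sketch using standard containers: `ColumnDescriptor` and `findColumns` are hypothetical names, and the model and data ids in the usage note are invented for illustration. It keeps one (idModel, idData) pair per column and returns the positions of the columns whose idData matches.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one descriptor column: the first descriptor row
// holds the model id, the second row holds the data id.
struct ColumnDescriptor
{
  std::string idModel;
  std::string idData;
};

// Return the positions of every column whose idData matches the request.
// An empty vector means that no column carries this id.
std::vector<int> findColumns(std::vector<ColumnDescriptor> const& descriptor,
                             std::string const& idData)
{
  std::vector<int> found;
  for (std::size_t j = 0; j < descriptor.size(); ++j)
  { if (descriptor[j].idData == idData) found.push_back(static_cast<int>(j));}
  return found;
}
```

With three columns tagged "x", "y", "x", a call to `findColumns(descriptor, "x")` yields the positions 0 and 2, so several columns can share one idData and be retrieved together.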

◆ data()

ReadWriteCsv const & STK::DataHandler::data ( ) const
inline

get the whole data set

Definition at line 80 of file STK_DataHandler.h.

  { return data_;}

References data_.

Referenced by getData(), readDataFromArray(), readDataFromArray2D(), and readDataFromCsvFile().

◆ descriptor()

ReadWriteCsv const & STK::DataHandler::descriptor ( ) const
inline

get the whole descriptor set

Definition at line 82 of file STK_DataHandler.h.

  { return descriptor_;}

References descriptor_.

◆ getData()

template<typename Type >
void STK::DataHandler::getData ( std::string const & idData,
Array2D< Type > &  data 
) const
Returns
in an Array2D<Type> the data with the given idData

Definition at line 148 of file STK_DataHandler.h.

{
  std::vector<int> indexes = colIndex(idData);
#ifdef STK_MIXTURE_VERY_VERBOSE
  stk_cout << _T("In DataHandler::getData, idData = ") << idData << _T("\n");
  stk_cout << _T("columns found = ");
  for (std::vector<int>::const_iterator it = indexes.begin(); it != indexes.end(); ++it)
  { stk_cout << (*it) << _T(" ");}
  stk_cout << _T("\n");
#endif
  int nbVariable = indexes.size();
  data.resize(nbSample(), nbVariable);
  int j= data.beginCols();
  for (std::vector<int>::const_iterator it = indexes.begin(); it != indexes.end(); ++it, ++j)
  {
    for (int i = data_.firstRow(*it); i <= data_.lastRow(*it); ++i)
    { data(i, j) = stringToType<Type>(data_(i,*it));}
  }
}

References _T, colIndex(), data(), data_, nbSample(), nbVariable(), and stk_cout.
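As a rough illustration of the conversion step performed by getData(), the sketch below copies selected text columns into a typed table using only the standard library. `parseCell` and `extractColumns` are hypothetical helpers, not STK++ functions, and the choice to leave unparsable cells at `Type()` is an assumption made for this example (the behaviour of stringToType on bad input is not documented here).

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse one text cell into a value of type Type.
// Returns false when the cell cannot be parsed.
template<typename Type>
bool parseCell(std::string const& cell, Type& value)
{
  std::istringstream is(cell);
  return static_cast<bool>(is >> value);
}

// Copy the selected text columns into a typed table indexed as out[row][col].
// columns[j][i] is the text stored at row i of column j.
// Assumption of this sketch: cells that fail to parse keep the value Type().
template<typename Type>
std::vector<std::vector<Type> > extractColumns(
    std::vector<std::vector<std::string> > const& columns,
    std::vector<int> const& indexes)
{
  std::size_t nbRow = columns.empty() ? 0 : columns.front().size();
  std::vector<std::vector<Type> > out(nbRow, std::vector<Type>(indexes.size()));
  for (std::size_t j = 0; j < indexes.size(); ++j)
  {
    std::vector<std::string> const& col = columns[static_cast<std::size_t>(indexes[j])];
    for (std::size_t i = 0; i < nbRow && i < col.size(); ++i)
    {
      Type v = Type();
      if (parseCell(col[i], v)) out[i][j] = v;
    }
  }
  return out;
}
```

The output table has one row per sample and one column per selected index, which mirrors the `data.resize(nbSample(), nbVariable)` call in the listing.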

◆ nbSample()

int STK::DataHandler::nbSample ( ) const
inline
Returns
the number of samples (the number of rows of the data)

Definition at line 84 of file STK_DataHandler.h.

  { return data_.sizeRows();}

References data_.

Referenced by getData().

◆ nbVariable()

int STK::DataHandler::nbVariable ( ) const
inline
Returns
the number of variables (the number of columns of the data)

Definition at line 86 of file STK_DataHandler.h.

  { return data_.size();}

References data_.

Referenced by getData().

◆ readDataFromArray()

template<typename Array >
bool STK::DataHandler::readDataFromArray ( ExprBase< Array > const & data,
std::string const & idData,
std::string const & idModel 
)

read a data set from an Array or Expression.

This method should essentially be used:

  • for testing some statistical method, as the data will be converted to a String format (which is not an efficient way to store the data).
  • if the data are already stored in a String format.

Parameters
data       the data set
idData     the id of the data
idModel    an id identifying the model to use with the data set

Definition at line 191 of file STK_DataHandler.h.

{
  // add descriptor
  Variable<std::string> desc(2, stringNa);
  desc[baseIdx] = idModel ; desc[baseIdx+1] = idData;
  if (!addInfo(idData, idModel)) return false;
  // store data at the end of the ReadWriteCsv array in a string format
  for (int j=data.beginCols(); j<= data.lastIdxCols(); ++j)
  {
    data_.push_back();
    data_.back().resize(data.rows());
    for (int i= data.beginRows(); i < data.endRows(); ++i)
    { data_.back()[i] = typeToString(data.elt(i,j), std::scientific);}
    // store descriptor : this is the same for all the columns added
    descriptor_.push_back(desc);
  }
  return true;
}

References STK::DataHandlerBase< DataHandler >::addInfo(), STK::baseIdx, data(), data_, descriptor_, STK::stringNa, and STK::typeToString().
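To make the cost of the String storage concrete, here is a small standard-library sketch of a value-to-text conversion driven by a stream manipulator. `toText` is a hypothetical helper written for this example; it only imitates the calling shape of `typeToString(value, std::scientific)` and is not the STK++ implementation.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: write a value through a stream after applying a
// caller-supplied format manipulator (std::dec by default).
template<typename Type>
std::string toText(Type const& value,
                   std::ios_base& (*format)(std::ios_base&) = std::dec)
{
  std::ostringstream os;
  os << format << value;   // e.g. std::scientific switches to exponent notation
  return os.str();
}
```

With the default stream precision, `toText(1.0 / 3.0, std::scientific)` keeps only six digits after the decimal point, so reading the text back does not restore the original double. This is one reason a text round trip is both slower and lossier than keeping numeric data in a numeric container.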

◆ readDataFromArray2D()

template<typename Type >
bool STK::DataHandler::readDataFromArray2D ( Array2D< Type > const & data,
std::string const & idData,
std::string const & idModel 
)

read a data set from an Array2D.

This method should essentially be used:

  • for testing some statistical method, as the data will be converted to a String format (which is not an efficient way to store the data).
  • if the data are already stored in a String format.

Parameters
data       the data set
idData     the id of the data
idModel    an id identifying the model to use with the data set

Definition at line 169 of file STK_DataHandler.h.

{
  // add descriptor
  Variable<std::string> desc(2, stringNa);
  desc[baseIdx] = idModel ; desc[baseIdx+1] = idData;
  if (!addInfo(idData, idModel)) return false;
  // store data at the end of the ReadWriteCsv array in a string format
  for (int j=data.beginCols(); j<= data.lastIdxCols(); ++j)
  {
    data_.push_back();
    data_.back().resize(data.rows());
    for (int i= data.beginRows(); i < data.endRows(); ++i)
    { data_.back()[i] = typeToString(data(i,j), std::scientific);}
    // store descriptor : this is the same for all the columns added
    descriptor_.push_back(desc);
  }
  return true;
}

References STK::DataHandlerBase< DataHandler >::addInfo(), STK::baseIdx, data(), data_, descriptor_, STK::stringNa, and STK::typeToString().

◆ readDataFromCsvFile() [1/2]

bool STK::DataHandler::readDataFromCsvFile ( std::string const & datafile,
std::string const & idData,
std::string const & idModel 
)

read a data file.

Parameters
datafile   the data file to get
idData     the id of the data
idModel    an id identifying the model to use with the data set

Definition at line 41 of file STK_DataHandler.cpp.

{
  ReadWriteCsv data(datafile);
  // no names to read in the first line
  data.setWithNames(withNames_);
  // read the data set
  if (!data.read())
  {
    stk_cerr << _T("An error occur when reading the data file.\nWhat: ")
             << data.error();
    return false;
  }
  // add descriptor
  Variable<std::string> desc(2, stringNa);
  desc[baseIdx] = idModel ; desc[baseIdx+1] = idData;
  // store data and descriptors
  if (!addInfo(idData, idModel)) return false;
  data_ += data;
  // store descriptor : this is the same for all the columns added
  for (int j=data.beginCols(); j< data.endCols(); ++j)
  { descriptor_.push_back(desc);}
  return true;
}

References _T, STK::DataHandlerBase< DataHandler >::addInfo(), STK::baseIdx, data(), data_, descriptor_, STK::IRunnerBase::error(), stk_cerr, STK::stringNa, and withNames_.

◆ readDataFromCsvFile() [2/2]

bool STK::DataHandler::readDataFromCsvFile ( std::string const & datafile,
std::string  descriptorfile 
)

read a data file and its companion description file.

Definition at line 68 of file STK_DataHandler.cpp.

{
  ReadWriteCsv rwdata(datafile);
  // no names to read in the first line
  rwdata.setWithNames(withNames_);
  // read the data set
  if (!rwdata.read())
  {
    stk_cerr << _T("An error occur when reading the data file.\nWhat: ")
             << rwdata.error();
    return false;
  }
  ReadWriteCsv rwdesc(descriptorfile);
  // no names to read in the first line
  rwdesc.setWithNames(false);
  // read the data set
  if (!rwdesc.read())
  {
    stk_cerr << _T("An error occur when reading the descriptor file.\nWhat: ")
             << rwdesc.error();
    return false;
  }
  // check logic
  if (rwdata.size() != rwdesc.size())
  {
    stk_cerr << _T("Data file and descriptor file does not have the same number of column.\n");
    return false;
  }
  if (rwdata.sizeRows() == 0)
  {
    stk_cerr << _T("No data.\n");
    return false;
  }
  if (rwdesc.sizeRows() < 2)
  {
    stk_cerr << _T("No descriptor.\n");
    return false;
  }
  // parse descriptor file
  int firstRow = rwdesc.beginRows();
  for (int j=rwdesc.beginCols(); j< rwdesc.endCols(); j++)
  {
    std::string idModel = rwdesc.at(j).at(firstRow);
    std::string idData = rwdesc.at(j).at(firstRow+1);
    if (!addInfo(idData, idModel)) return false;
  }
  // store data and descriptors
  data_ += rwdata;
  descriptor_ += rwdesc;
  return true;
}

References _T, STK::DataHandlerBase< DataHandler >::addInfo(), data_, descriptor_, STK::IRunnerBase::error(), stk_cerr, and withNames_.
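The consistency checks in the listing above (same number of columns, at least one data row, at least two descriptor rows) can be sketched independently of STK++. In the example below, `TextColumns`, `checkLayout` and the returned messages are all hypothetical; the sketch stores each table as a vector of text columns and, unlike the listing, checks the descriptor height column by column.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory table: table[j][i] is the text at row i of column j.
typedef std::vector<std::vector<std::string> > TextColumns;

// Return an empty string when data and descriptor are consistent,
// otherwise a short message naming the first failed check.
std::string checkLayout(TextColumns const& data, TextColumns const& desc)
{
  if (data.size() != desc.size()) return "column count mismatch";
  if (data.empty() || data.front().empty()) return "no data";
  for (std::size_t j = 0; j < desc.size(); ++j)
  { if (desc[j].size() < 2) return "no descriptor";}  // need idModel and idData
  return "";
}
```

Running all checks before anything is stored, as the listing does, means a rejected file pair leaves the previously loaded data untouched.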

◆ removeData()

void STK::DataHandler::removeData ( std::string const & idData)

remove the data with the given idData

Definition at line 121 of file STK_DataHandler.cpp.

{
  int rowIdData = descriptor_.beginRows()+1;
  for (int i = descriptor_.endCols()-1; i >= descriptor_.beginCols(); --i)
  { if (descriptor_.var(i)[rowIdData] == idData)
    {
      data_.eraseColumn(i);
      descriptor_.eraseColumn(i);
    }
  }
  info_.erase(idData);
}

References data_, descriptor_, and STK::DataHandlerBase< DataHandler >::info_.
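The loop in removeData() walks the columns from last to first. A plausible reason, illustrated here with standard vectors, is that erasing a column shifts every later column down by one, so a forward loop would skip the column that moves into the erased slot. `eraseById` and its parallel-vector storage are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical storage: ids[j] is the idData attached to columns[j].
// Erase every column tagged with idData, scanning from the back so that
// the positions still to be visited are never shifted by an erase.
void eraseById(std::vector<std::string>& ids,
               std::vector<std::vector<std::string> >& columns,
               std::string const& idData)
{
  assert(ids.size() == columns.size());  // both containers describe the same columns
  for (int j = static_cast<int>(ids.size()) - 1; j >= 0; --j)
  {
    if (ids[j] == idData)
    {
      columns.erase(columns.begin() + j);
      ids.erase(ids.begin() + j);
    }
  }
}
```

With ids "x", "y", "x", "x", a call with "x" leaves only the "y" column, including the case where two matching columns are adjacent.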

◆ setWithNames()

void STK::DataHandler::setWithNames ( bool  withNames)
inline

set withNames flag

Definition at line 89 of file STK_DataHandler.h.

  { withNames_ = withNames;}

References withNames_.

Member Data Documentation

◆ data_

◆ descriptor_

ReadWriteCsv STK::DataHandler::descriptor_
private

descriptor files with two rows.

The first row holds the idModel and the second row holds the idData.

Definition at line 142 of file STK_DataHandler.h.

Referenced by colIndex(), DataHandler(), descriptor(), readDataFromArray(), readDataFromArray2D(), readDataFromCsvFile(), readDataFromCsvFile(), and removeData().
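The two-row layout lends itself to building an (idData, idModel) map in a single pass over the descriptor columns. The sketch below is an assumption-laden illustration: `collectInfo` is a hypothetical function, and its conflict policy (the same idData may repeat, but only with the same idModel) is a guess at reasonable behaviour, since the body of addInfo() is not shown on this page.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Same shape as the InfoMap typedef: idData -> idModel.
typedef std::map<std::string, std::string> InfoMap;

// Each descriptor column is given as the pair (idModel, idData), i.e. the
// first and second descriptor rows.
// Assumed policy: a repeated idData is accepted only with the same idModel.
bool collectInfo(std::vector<std::pair<std::string, std::string> > const& descriptor,
                 InfoMap& info)
{
  for (std::size_t j = 0; j < descriptor.size(); ++j)
  {
    std::string const& idModel = descriptor[j].first;
    std::string const& idData  = descriptor[j].second;
    std::pair<InfoMap::iterator, bool> res = info.insert(std::make_pair(idData, idModel));
    // insert() does not overwrite: on a duplicate key, compare the stored model
    if (!res.second && res.first->second != idModel) return false;
  }
  return true;
}
```

Because `std::map::insert` leaves an existing entry unchanged, the first model registered for an idData wins and any later mismatch is reported rather than silently replaced.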

◆ withNames_

bool STK::DataHandler::withNames_
private

first line with names ?

Definition at line 136 of file STK_DataHandler.h.

Referenced by readDataFromCsvFile(), readDataFromCsvFile(), and setWithNames().


The documentation for this class was generated from the following files: