libmsnumpress
Numerical compression schemes for proteomics mass spectrometry data
Functions | |
static bool | is_little_endian () |
static void | encodeFixedPoint (double fixedPoint, unsigned char *result) |
static double | decodeFixedPoint (const unsigned char *data) |
static void | encodeInt (const unsigned int x, unsigned char *res, size_t *res_length) |
static void | decodeInt (const unsigned char *data, size_t *di, size_t max_di, size_t *half, unsigned int *res) |
double | optimalLinearFixedPointMass (const double *data, size_t dataSize, double mass_acc) |
double | optimalLinearFixedPoint (const double *data, size_t dataSize) |
size_t | encodeLinear (const double *data, size_t dataSize, unsigned char *result, double fixedPoint) |
size_t | decodeLinear (const unsigned char *data, const size_t dataSize, double *result) |
void | encodeLinear (const std::vector< double > &data, std::vector< unsigned char > &result, double fixedPoint) |
void | decodeLinear (const std::vector< unsigned char > &data, std::vector< double > &result) |
size_t | encodeSafe (const double *data, const size_t dataSize, unsigned char *result) |
size_t | decodeSafe (const unsigned char *data, const size_t dataSize, double *result) |
size_t | encodePic (const double *data, size_t dataSize, unsigned char *result) |
size_t | decodePic (const unsigned char *data, const size_t dataSize, double *result) |
void | encodePic (const std::vector< double > &data, std::vector< unsigned char > &result) |
void | decodePic (const std::vector< unsigned char > &data, std::vector< double > &result) |
double | optimalSlofFixedPoint (const double *data, size_t dataSize) |
size_t | encodeSlof (const double *data, size_t dataSize, unsigned char *result, double fixedPoint) |
size_t | decodeSlof (const unsigned char *data, const size_t dataSize, double *result) |
void | encodeSlof (const std::vector< double > &data, std::vector< unsigned char > &result, double fixedPoint) |
void | decodeSlof (const std::vector< unsigned char > &data, std::vector< double > &result) |
Variables | |
const int | ONE = 1 |
bool | IS_LITTLE_ENDIAN = is_little_endian() |
static double ms::numpress::MSNumpress::decodeFixedPoint | ( | const unsigned char * | data | ) |
Definition at line 62 of file MSNumpress.cpp.
References IS_LITTLE_ENDIAN.
Referenced by decodeLinear(), and decodeSlof().
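static void ms::numpress::MSNumpress::decodeInt | ( | const unsigned char * | data, |
size_t * | di, | ||
size_t | max_di, | ||
size_t * | half, | ||
unsigned int * | res | ||
) |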
Decodes an int from the half bytes in data. Lossless reverse of encodeInt.
data | ptr to the char data to decode |
di | position in the char data array to start decoding (will be advanced) |
max_di | size of data array |
half | helper variable (do not change between multiple calls) |
res | result (a 32 bit integer) |
Definition at line 151 of file MSNumpress.cpp.
Referenced by decodeLinear(), and decodePic().
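The di, max_di and half parameters suggest a cursor that walks the byte array one half byte at a time. A minimal sketch of such a cursor follows. The name readHalfByte is hypothetical, and the choice to read the upper half byte of each byte first is an assumption of this sketch, not something this page states.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical half byte reader (not the library's code).
// Assumption: the upper half of each byte is read before the lower half.
// di is the byte position, half is 0 before the upper half and 1 before the lower half.
std::uint8_t readHalfByte(const unsigned char* data, std::size_t& di,
                          std::size_t max_di, std::size_t& half) {
    if (di >= max_di) {
        throw "[sketch] read past the end of the data";  // const char*, as on this page
    }
    std::uint8_t value;
    if (half == 0) {
        value = static_cast<std::uint8_t>(data[di] >> 4);    // upper half, stay on this byte
        half = 1;
    } else {
        value = static_cast<std::uint8_t>(data[di] & 0x0F);  // lower half, then advance
        half = 0;
        ++di;
    }
    return value;
}
```

Because half carries state between calls, the caller has to pass the same variable unchanged to every call that continues the same stream.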
void ms::numpress::MSNumpress::decodeLinear | ( | const std::vector< unsigned char > & | data, |
std::vector< double > & | result | ||
) |
Calls lower level decodeLinear while handling vector sizes appropriately
Note that this method may throw a const char* if it deems the input data to be corrupt, i.e. if the last encoded int does not use the last byte in the data. In addition, the last encoded int needs to use either the last halfbyte, or the second last followed by a 0x0 halfbyte.
data | vector of bytes to be decoded |
Definition at line 451 of file MSNumpress.cpp.
References decodeLinear().
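Since the decoders are documented as throwing a const char* rather than a std::exception, a caller needs a matching handler. The sketch below shows the pattern with a stand-in decoder. fakeDecode and tryDecode are hypothetical names, and the "fewer than 8 bytes" check is only an illustrative failure condition.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a decoder that signals corrupt input with a C string.
void fakeDecode(const std::vector<unsigned char>& data, std::vector<double>& result) {
    if (data.size() < 8) {
        throw "[sketch] input too short";  // illustrative failure condition only
    }
    result.assign(1, 0.0);
}

// Returns true on success. On failure the thrown message is copied into error.
bool tryDecode(const std::vector<unsigned char>& data,
               std::vector<double>& result, std::string& error) {
    try {
        fakeDecode(data, result);
        return true;
    } catch (const char* msg) {  // a catch (const std::exception&) would not match
        error = msg;
        result.clear();
        return false;
    }
}
```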
size_t ms::numpress::MSNumpress::decodeLinear | ( | const unsigned char * | data, |
const size_t | dataSize, | ||
double * | result | ||
) |
Decodes data encoded by encodeLinear.
The result vector is guaranteed to contain at most (|data| - 8) * 2 values.
Note that this method may throw a const char* if it deems the input data to be corrupt, i.e. if the last encoded int does not use the last byte in the data. In addition, the last encoded int needs to use either the last halfbyte, or the second last followed by a 0x0 halfbyte.
data | pointer to the array of bytes to be decoded (must be contiguous in memory) |
dataSize | number of bytes from *data to decode |
Definition at line 361 of file MSNumpress.cpp.
References decodeFixedPoint(), and decodeInt().
Referenced by decodeLinear(), decodeLinearCorrupt1(), decodeLinearCorrupt2(), decodeLinearNice(), decodeLinearNiceLowFP(), decodeLinearWierd(), encodeDecodeLinear(), encodeDecodeLinear5(), and encodeDecodeLinearStraight().
void ms::numpress::MSNumpress::decodePic | ( | const std::vector< unsigned char > & | data, |
std::vector< double > & | result | ||
) |
Calls lower level decodePic while handling vector sizes appropriately
Note that this method may throw a const char* if it deems the input data to be corrupt, i.e. if the last encoded int does not use the last byte in the data. In addition, the last encoded int needs to use either the last halfbyte, or the second last followed by a 0x0 halfbyte.
data | vector of bytes to be decoded |
Definition at line 665 of file MSNumpress.cpp.
References decodePic().
size_t ms::numpress::MSNumpress::decodePic | ( | const unsigned char * | data, |
const size_t | dataSize, | ||
double * | result | ||
) |
Decodes data encoded by encodePic
The result vector is guaranteed to contain at most |data| * 2 values.
Note that this method may throw a const char* if it deems the input data to be corrupt, i.e. if the last encoded int does not use the last byte in the data. In addition, the last encoded int needs to use either the last halfbyte, or the second last followed by a 0x0 halfbyte.
data | pointer to the array of bytes to be decoded (must be contiguous in memory) |
dataSize | number of bytes from *data to decode |
Definition at line 617 of file MSNumpress.cpp.
References decodeInt().
Referenced by decodePic(), encodeDecodePic(), encodeDecodePic5(), and testErroneousDecodePic().
size_t ms::numpress::MSNumpress::decodeSafe | ( | const unsigned char * | data, |
const size_t | dataSize, | ||
double * | result | ||
) |
Decodes data encoded by encodeSafe.
The result vector is the same size as the input data.
Might throw a const char* if something goes wrong during decoding.
data | pointer to the array of bytes to be decoded (must be contiguous in memory) |
dataSize | number of bytes from *data to decode |
Definition at line 510 of file MSNumpress.cpp.
References IS_LITTLE_ENDIAN.
Referenced by encodeDecodeSafe(), and encodeDecodeSafeStraight().
void ms::numpress::MSNumpress::decodeSlof | ( | const std::vector< unsigned char > & | data, |
std::vector< double > & | result | ||
) |
Calls lower level decodeSlof while handling vector sizes appropriately
Note that this method may throw a const char* if it deems the input data to be corrupt.
data | vector of bytes to be decoded |
Definition at line 770 of file MSNumpress.cpp.
References decodeSlof().
size_t ms::numpress::MSNumpress::decodeSlof | ( | const unsigned char * | data, |
const size_t | dataSize, | ||
double * | result | ||
) |
Decodes data encoded by encodeSlof
The result will contain exactly (|data| - 8) / 2 doubles.
Note that this method may throw a const char* if it deems the input data to be corrupt.
data | pointer to the array of bytes to be decoded (must be contiguous in memory) |
dataSize | number of bytes from *data to decode |
Definition at line 733 of file MSNumpress.cpp.
References decodeFixedPoint().
Referenced by decodeSlof(), encodeDecodeSlof(), and encodeDecodeSlof5().
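The pointer-based decoders write into a caller-supplied buffer, so the size statements above translate directly into allocation sizes. The helpers below restate those bounds as code. The function names are hypothetical, and returning 0 for inputs shorter than 8 bytes is an assumption made here to avoid unsigned underflow.

```cpp
#include <cstddef>

// Upper bound on doubles produced by decodeLinear for nBytes of input: (|data| - 8) * 2.
std::size_t maxDecodedLinear(std::size_t nBytes) {
    return nBytes < 8 ? 0 : (nBytes - 8) * 2;  // assumption: 0 when shorter than 8 bytes
}

// Upper bound on doubles produced by decodePic: |data| * 2.
std::size_t maxDecodedPic(std::size_t nBytes) {
    return nBytes * 2;
}

// Exact number of doubles produced by decodeSlof: (|data| - 8) / 2.
std::size_t decodedSlof(std::size_t nBytes) {
    return nBytes < 8 ? 0 : (nBytes - 8) / 2;  // assumption: 0 when shorter than 8 bytes
}
```

A caller could size a std::vector<double> with one of these values before passing its data() pointer as the result argument.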
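static void ms::numpress::MSNumpress::encodeFixedPoint | ( | double | fixedPoint, |
unsigned char * | result | ||
) |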
Definition at line 49 of file MSNumpress.cpp.
References IS_LITTLE_ENDIAN.
Referenced by encodeLinear(), and encodeSlof().
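static void ms::numpress::MSNumpress::encodeInt | ( | const unsigned int | x, |
unsigned char * | res, | ||
size_t * | res_length | ||
) |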
Encodes the int x as a number of halfbytes in res. res_length is incremented by the number of halfbytes n, where 1 <= n <= 9.
Definition at line 83 of file MSNumpress.cpp.
Referenced by encodeLinear(), and encodePic().
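One way to reach the stated range of 1 to 9 halfbytes is to spend one header halfbyte on the number of leading 0x0 (or 0xF) halfbytes that were dropped, followed by the remaining halfbytes. The sketch below illustrates that idea with a round trip. It is an assumption about the general scheme, not a copy of encodeInt or decodeInt, and the names encodeIntSketch and decodeIntSketch are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends 1..9 half bytes (each 0..15) describing x to out. Sketch only.
void encodeIntSketch(std::uint32_t x, std::vector<std::uint8_t>& out) {
    const std::uint32_t top = x >> 28;   // most significant half byte
    const bool ones = (top == 0xF);      // leading 0xF run (small negative if read as signed)
    unsigned skip = 0;                   // number of leading half bytes left out
    if (top == 0x0 || ones) {
        const std::uint32_t fill = ones ? 0xF : 0x0;
        const unsigned maxSkip = ones ? 7 : 8;  // header (skip + 8) must stay below 16
        while (skip < maxSkip && ((x >> (28 - 4 * skip)) & 0xF) == fill) ++skip;
    }
    // Header: 0..8 means dropped 0x0 half bytes, 9..15 means (header - 8) dropped 0xF half bytes.
    out.push_back(static_cast<std::uint8_t>(ones ? skip + 8 : skip));
    for (unsigned i = 0; i < 8 - skip; ++i) {
        out.push_back(static_cast<std::uint8_t>((x >> (4 * i)) & 0xF));  // lowest half byte first
    }
}

// Reads one value written by encodeIntSketch, starting at pos, and advances pos.
std::uint32_t decodeIntSketch(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    const std::uint8_t head = in.at(pos++);
    const unsigned skip = (head <= 8) ? head : head - 8u;
    std::uint32_t x = 0;
    if (head > 8) {
        for (unsigned i = 0; i < skip; ++i) x |= 0xFu << (28 - 4 * i);  // restore 0xF run
    }
    for (unsigned i = 0; i < 8 - skip; ++i) {
        x |= static_cast<std::uint32_t>(in.at(pos++)) << (4 * i);
    }
    return x;
}
```

With this layout a value of 0 costs a single halfbyte, while a value whose top halfbyte is neither 0x0 nor 0xF costs the full nine.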
size_t ms::numpress::MSNumpress::encodeLinear | ( | const double * | data, |
const size_t | dataSize, | ||
unsigned char * | result, | ||
double | fixedPoint | ||
) |
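Encodes the doubles in data by first using a lossy conversion to a fixed point representation, then storing the residuals from a linear prediction after the first two values, and finally encoding these residuals with encodeInt.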
The resulting binary is maximally 8 + dataSize * 5 bytes, but much less if the data is reasonably smooth on the first order.
This encoding is suitable for typical m/z or retention time binary arrays. On a test set, the encoding was empirically shown to be accurate to at least 0.002 ppm.
data | pointer to the array of doubles to be encoded (must be contiguous in memory) |
dataSize | number of doubles from *data to encode |
Definition at line 270 of file MSNumpress.cpp.
References encodeFixedPoint(), encodeInt(), and THROW_ON_OVERFLOW.
Referenced by decodeLinearCorrupt2(), decodeLinearNice(), decodeLinearNiceLowFP(), decodeLinearWierd(), decodeLinearWierd_int_overflow(), decodeLinearWierd_int_underflow(), decodeLinearWierd_llong_overflow(), encodeDecodeLinear(), encodeDecodeLinear5(), encodeDecodeLinearStraight(), encodeLinear(), and encodeLinear1().
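The remark that the output is much smaller for data that is smooth on the first order can be illustrated with linear prediction residuals: when consecutive fixed point values grow almost linearly, the difference between each value and a straight-line extrapolation from its two predecessors is close to zero. The sketch below is a simplified illustration under that assumption. The function names are hypothetical, 64-bit integers are used for convenience, and the byte layout of encodeLinear is not reproduced.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed point conversion followed by linear prediction residuals. Sketch only.
std::vector<std::int64_t> linearResiduals(const std::vector<double>& data, double fixedPoint) {
    std::vector<std::int64_t> ints(data.size()), out(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        ints[i] = std::llround(data[i] * fixedPoint);  // lossy step: round to fixed point
    }
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i < 2) {
            out[i] = ints[i];                          // first two values kept as they are
        } else {
            const std::int64_t predicted = 2 * ints[i - 1] - ints[i - 2];
            out[i] = ints[i] - predicted;              // small when the data is smooth
        }
    }
    return out;
}

// Reverses linearResiduals, up to the rounding error of the fixed point step.
std::vector<double> undoLinearResiduals(const std::vector<std::int64_t>& res, double fixedPoint) {
    std::vector<std::int64_t> ints(res.size());
    std::vector<double> out(res.size());
    for (std::size_t i = 0; i < res.size(); ++i) {
        ints[i] = (i < 2) ? res[i] : res[i] + 2 * ints[i - 1] - ints[i - 2];
        out[i] = static_cast<double>(ints[i]) / fixedPoint;
    }
    return out;
}
```

Small residuals matter because a halfbyte integer encoding spends fewer halfbytes on values near zero.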
void ms::numpress::MSNumpress::encodeLinear | ( | const std::vector< double > & | data, |
std::vector< unsigned char > & | result, | ||
double | fixedPoint | ||
) |
Calls lower level encodeLinear while handling vector sizes appropriately
data | vector of doubles to be encoded |
Definition at line 438 of file MSNumpress.cpp.
References encodeLinear().
size_t ms::numpress::MSNumpress::encodePic | ( | const double * | data, |
const size_t | dataSize, | ||
unsigned char * | result | ||
) |
Encodes ion counts by simply rounding to the nearest 4 byte integer, and compressing each integer with encodeInt.
The representable range is therefore 0 to 4294967294. The resulting binary is maximally dataSize * 5 bytes, but much less if the data is close to 0 on average.
data | pointer to the array of doubles to be encoded (must be contiguous in memory) |
dataSize | number of doubles from *data to encode |
Definition at line 568 of file MSNumpress.cpp.
References encodeInt(), and THROW_ON_OVERFLOW.
Referenced by encodeDecodePic(), encodeDecodePic5(), and encodePic().
void ms::numpress::MSNumpress::encodePic | ( | const std::vector< double > & | data, |
std::vector< unsigned char > & | result | ||
) |
Calls lower level encodePic while handling vector sizes appropriately
data | vector of doubles to be encoded |
Definition at line 653 of file MSNumpress.cpp.
References encodePic().
size_t ms::numpress::MSNumpress::encodeSafe | ( | const double * | data, |
const size_t | dataSize, | ||
unsigned char * | result | ||
) |
Encodes the doubles in data by storing the residuals from a linear prediction after the first two values.
The resulting binary is the same size as the input data.
This encoding is suitable for typical m/z or retention time binary arrays, and is intended to be used before zlib compression to improve compression.
data | pointer to the array of doubles to be encoded (must be contiguous in memory) |
dataSize | number of doubles from *data to encode |
Definition at line 464 of file MSNumpress.cpp.
References IS_LITTLE_ENDIAN.
Referenced by encodeDecodeSafe(), and encodeDecodeSafeStraight().
size_t ms::numpress::MSNumpress::encodeSlof | ( | const double * | data, |
const size_t | dataSize, | ||
unsigned char * | result, | ||
double | fixedPoint | ||
) |
Encodes ion counts by taking the natural logarithm, and storing a fixed point representation of this. This is calculated as
unsigned short fp = log(d + 1) * fixedPoint + 0.5
The result vector is exactly |data| * 2 + 8 bytes long.
data | pointer to the array of doubles to be encoded (must be contiguous in memory) |
dataSize | number of doubles from *data to encode |
Definition at line 704 of file MSNumpress.cpp.
References encodeFixedPoint(), and THROW_ON_OVERFLOW.
Referenced by encodeDecodeSlof(), encodeDecodeSlof5(), and encodeSlof().
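The formula given for encodeSlof can be tried out on single values. The sketch below applies it as written and pairs it with its algebraic inverse, exp(fp / fixedPoint) - 1. The function names are hypothetical, the range check is an assumption of this sketch, and the 8 bytes that precede the 2-byte values in the real output are not modelled.

```cpp
#include <cmath>
#include <cstdint>

// Applies fp = log(d + 1) * fixedPoint + 0.5 and truncates to an unsigned 16 bit value.
std::uint16_t slofEncodeValue(double d, double fixedPoint) {
    const double scaled = std::log(d + 1.0) * fixedPoint + 0.5;
    if (scaled < 0.0 || scaled >= 65536.0) {
        throw "[sketch] value does not fit in an unsigned short";  // assumed check
    }
    return static_cast<std::uint16_t>(scaled);  // truncation after + 0.5 rounds to nearest
}

// Algebraic inverse of the formula above.
double slofDecodeValue(std::uint16_t fp, double fixedPoint) {
    return std::exp(static_cast<double>(fp) / fixedPoint) - 1.0;
}
```

Because the rounding happens in log space, the error after decoding is relative to d + 1 rather than absolute, which suits ion counts that span several orders of magnitude.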
void ms::numpress::MSNumpress::encodeSlof | ( | const std::vector< double > & | data, |
std::vector< unsigned char > & | result, | ||
double | fixedPoint | ||
) |
Calls lower level encodeSlof while handling vector sizes appropriately
data | vector of doubles to be encoded |
Definition at line 757 of file MSNumpress.cpp.
References encodeSlof().
static bool ms::numpress::MSNumpress::is_little_endian | ( | ) |
Definition at line 40 of file MSNumpress.cpp.
References ONE.
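A function that references only an int constant equal to 1 is most likely using the common trick of inspecting the first byte of that int in memory. The sketch below shows that trick. It is an assumption about the approach, the name isLittleEndianSketch is hypothetical, and the local variable one merely plays the role of ONE.

```cpp
#include <cstring>

// Hypothetical endianness probe (not the library's code).
bool isLittleEndianSketch() {
    const int one = 1;                 // stands in for the ONE constant
    unsigned char firstByte = 0;
    std::memcpy(&firstByte, &one, 1);  // copy the lowest-addressed byte of the int
    return firstByte == 1;             // the least significant byte comes first on little-endian hosts
}
```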
double ms::numpress::MSNumpress::optimalLinearFixedPoint | ( | const double * | data, |
size_t | dataSize | ||
) |
Compute the maximal linear fixed point that prevents integer overflow.
data | pointer to the array of doubles to be encoded (must be contiguous in memory) |
dataSize | number of doubles from *data to encode |
Definition at line 234 of file MSNumpress.cpp.
Referenced by decodeLinearCorrupt2(), decodeLinearNice(), decodeLinearWierd(), encodeDecodeLinear(), encodeDecodeLinear5(), encodeDecodeLinearStraight(), and optimalLinearFixedPoint().
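The idea of a maximal fixed point that avoids integer overflow can be shown with a deliberately simplified rule: pick the largest multiplier for which the largest absolute input value still fits in a signed 32-bit integer. This is an assumption made for illustration. The name maxFixedPointSketch is hypothetical, and the sketch ignores any additional headroom the real function may reserve for prediction residuals.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Largest multiplier that keeps max|data[i]| * multiplier within int32 range. Sketch only.
double maxFixedPointSketch(const double* data, std::size_t dataSize) {
    double maxAbs = 0.0;
    for (std::size_t i = 0; i < dataSize; ++i) {
        maxAbs = std::max(maxAbs, std::fabs(data[i]));
    }
    if (maxAbs == 0.0) {
        return 0.0;  // assumption: nothing to scale for empty or all-zero input
    }
    const double limit = static_cast<double>(std::numeric_limits<std::int32_t>::max());
    return std::floor(limit / maxAbs);
}
```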
double ms::numpress::MSNumpress::optimalLinearFixedPointMass | ( | const double * | data, |
size_t | dataSize, | ||
double | mass_acc | ||
) |
Compute the optimal linear fixed point with a desired m/z accuracy.
data | pointer to the array of doubles to be encoded (must be contiguous in memory) |
dataSize | number of doubles from *data to encode |
mass_acc | desired m/z accuracy in Th |
Definition at line 213 of file MSNumpress.cpp.
References optimalLinearFixedPoint().
Referenced by decodeLinearNiceLowFP(), and optimalLinearFixedPointMass().
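The link between a desired accuracy and a fixed point follows from rounding: rounding value * fixedPoint to the nearest integer changes the value by at most 0.5 / fixedPoint, so an accuracy of mass_acc needs a fixed point of at least 0.5 / mass_acc. The sketch below encodes that arithmetic. The function names are hypothetical, and returning -1 when the requirement exceeds the overflow limit is a convention assumed for this sketch.

```cpp
#include <cmath>

// Smallest fixed point whose rounding error stays within massAcc, or -1 if it
// would exceed overflowLimit (assumed failure convention for this sketch).
double fixedPointForAccuracy(double massAcc, double overflowLimit) {
    const double needed = 0.5 / massAcc;  // rounding error is at most 0.5 / fixedPoint
    return needed > overflowLimit ? -1.0 : needed;
}

// Rounds value to the nearest multiple of 1 / fixedPoint.
double quantize(double value, double fixedPoint) {
    return std::round(value * fixedPoint) / fixedPoint;
}
```

For example, an accuracy of 0.0001 Th gives a fixed point of about 5000, and quantizing 445.12003 with it yields a value within 0.0001 of the input.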
double ms::numpress::MSNumpress::optimalSlofFixedPoint | ( | const double * | data, |
size_t | dataSize | ||
) |
Definition at line 679 of file MSNumpress.cpp.
Referenced by encodeDecodeSlof(), encodeDecodeSlof5(), and optimalSlofFixedPoint().
bool ms::numpress::MSNumpress::IS_LITTLE_ENDIAN = is_little_endian() |
Definition at line 43 of file MSNumpress.cpp.
Referenced by decodeFixedPoint(), decodeSafe(), encodeFixedPoint(), and encodeSafe().
const int ms::numpress::MSNumpress::ONE = 1 |
Definition at line 39 of file MSNumpress.cpp.
Referenced by is_little_endian().