Eigen: A C++ Linear Algebra Template Library: MD Ashiqur Rahman

Eigen is a C++ template library for linear algebra that provides simple interfaces and good performance. It uses expression templates and lazy evaluation to avoid unnecessary temporary objects and vectorizes operations using SIMD instructions. Eigen can handle dense and sparse matrices and vectors and includes linear algebra algorithms, geometry functions, and other features. Benchmarks show it performs comparably to optimized libraries like MKL and GotoBLAS.

Eigen: A C++ Linear Algebra Template Library
Md Ashiqur Rahman
Outline
● Introduction & Motivation
● How it works
● Implementation of Eigen
- Expression templates, Lazy evaluation, Vectorization
● Aliasing problems
● Platforms
● Eigen vs BLAS/Lapack
● Benchmark
● Conclusion
Introduction
● A C++ template library for linear algebra
● Header-only: nothing to install or compile
● Provides good speed and a simple, easy-to-use interface
● Open source
Why Another Library
● Multiplatform, with good compiler support
● A single unified library
● Most libraries specialize in only one feature or module
● Eigen satisfies all of these criteria:
- free, fast, versatile, reliable, decent API, support for both sparse and dense matrices, vectors and arrays, linear algebra algorithms (LU, QR, ...), geometric transformations.
How it works
● The Matrix class template takes 3 compulsory and 3 optional arguments
Matrix<typename Scalar,
       int RowsAtCompileTime,
       int ColsAtCompileTime,
       int Options = 0,
       int MaxRowsAtCompileTime = RowsAtCompileTime,
       int MaxColsAtCompileTime = ColsAtCompileTime>
● Typedefs cover different scalar types and sizes
typedef Matrix<float, 4, 4> Matrix4f;
typedef Matrix<double, Dynamic, Dynamic> MatrixXd;
typedef Matrix<float, 3, 1> Vector3f;
typedef Matrix<int, 1, 2> RowVector2i;
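To make the role of the size arguments concrete, here is a minimal sketch of how compile-time dimensions and a "dynamic" sentinel can select different storage. It is not Eigen's implementation: the namespace `storage_sketch`, the sentinel value, and the simplified three-argument `Matrix` are assumptions made for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

namespace storage_sketch {  // hypothetical mini version, not Eigen's code

const int Dynamic = -1;  // sentinel: "size known only at run time"

// Fixed-size case: dimensions are template arguments, so the
// coefficients live in a plain array inside the object.
template <typename Scalar, int Rows, int Cols>
class Matrix {
public:
    int rows() const { return Rows; }
    int cols() const { return Cols; }
    Scalar& operator()(int r, int c) {
        assert(r >= 0 && r < Rows && c >= 0 && c < Cols);
        return data_[static_cast<std::size_t>(c) * Rows + r];  // column-major
    }
private:
    std::array<Scalar, Rows * Cols> data_{};
};

// Dynamic case: dimensions are stored in the object and the
// coefficients are allocated on the heap.
template <typename Scalar>
class Matrix<Scalar, Dynamic, Dynamic> {
public:
    Matrix(int rows, int cols)
        : rows_(rows), cols_(cols),
          data_(static_cast<std::size_t>(rows) * cols) {}
    int rows() const { return rows_; }
    int cols() const { return cols_; }
    Scalar& operator()(int r, int c) {
        assert(r >= 0 && r < rows_ && c >= 0 && c < cols_);
        return data_[static_cast<std::size_t>(c) * rows_ + r];  // column-major
    }
private:
    int rows_;
    int cols_;
    std::vector<Scalar> data_;
};

typedef Matrix<float, 4, 4> Matrix4f;
typedef Matrix<double, Dynamic, Dynamic> MatrixXd;

}  // namespace storage_sketch
```

With this sketch, `storage_sketch::Matrix4f a;` needs no constructor arguments, while `storage_sketch::MatrixXd b(2, 5);` receives its dimensions at run time.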

Eigen Implementation: 1D array
● Simple vector addition example
int size = 50;
Eigen::VectorXf u(size), v(size), w(size);
u = v + w;
● Coefficients are stored in a one-dimensional array, so a single loop traverses it
for(int i = 0; i < size; ++i)
    u[i] = v[i] + w[i];
Eigen Implementation: use expression template
● With naive operator overloading, the addition would go through a temporary object
VectorXf tmp = v + w;
VectorXf u = tmp;
● That costs two loops instead of one
for(int i = 0; i < size; i++) tmp[i] = v[i] + w[i];
for(int i = 0; i < size; i++) u[i] = tmp[i];
● Eigen uses expression templates to avoid unnecessary temporary objects, so the assignment compiles to a single loop
for(int i = 0; i < size; i++) u[i] = v[i] + w[i];
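The idea can be sketched in a few lines of plain C++. This is a deliberately tiny illustration, not Eigen's code: the names `et_sketch`, `Vec` and `Sum` are hypothetical, and only the two-operand case `u = v + w` is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace et_sketch {  // hypothetical names, not Eigen's real classes

// Proxy describing "lhs + rhs"; nothing is computed when it is built.
template <typename Lhs, typename Rhs>
struct Sum {
    const Lhs& lhs;
    const Rhs& rhs;
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

class Vec {
public:
    explicit Vec(std::size_t n, float value = 0.0f) : data_(n, value) {}

    float operator[](std::size_t i) const { return data_[i]; }
    float& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

    // Assigning from an expression runs the single fused loop.
    template <typename Lhs, typename Rhs>
    Vec& operator=(const Sum<Lhs, Rhs>& expr) {
        assert(expr.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = expr[i];
        return *this;
    }

private:
    std::vector<float> data_;
};

// operator+ only builds the proxy; no temporary vector is allocated.
inline Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    return Sum<Vec, Vec>{a, b};
}

}  // namespace et_sketch
```

Writing `u = v + w;` with three `et_sketch::Vec` objects of equal size then performs one pass over the data, because the sum is only evaluated coefficient by coefficient inside the assignment.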
Eigen Implementation: lazy evaluation
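● Expressions are evaluated lazily by default; Eigen decides for each sub-expression whether a temporary is worthwhile.
● Exceptions, where a sub-expression is evaluated immediately:
- Matrix products
- Nested expressions containing a product: here matrix3 * matrix4 is evaluated into a temporary before the addition
matrix1 = matrix2 + matrix3 * matrix4;
- When the cost model favors immediate evaluation: here matrix3 + matrix4 is evaluated once into a temporary, because the product reads each of its coefficients several times
matrix1 = matrix2 * (matrix3 + matrix4);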
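The cost argument behind the last case can be checked by counting. The sketch below is plain C++ with hypothetical names (`lazy_sketch`, `product_lazy_sum`, `product_temporary_sum`); it does not use Eigen and only mimics the two strategies for square n x n matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace lazy_sketch {  // hypothetical helpers for illustration

typedef std::vector<std::vector<double> > Mat;  // square n x n matrix

struct Result {
    Mat value;
    std::size_t sumEvaluations;  // how often a coefficient of b + c was computed
};

// Lazy nesting: (b + c)(k, j) is recomputed each time the product reads it.
inline Result product_lazy_sum(const Mat& a, const Mat& b, const Mat& c) {
    const std::size_t n = a.size();
    assert(b.size() == n && c.size() == n);
    Result r{Mat(n, std::vector<double>(n, 0.0)), 0};
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k) {
                r.value[i][j] += a[i][k] * (b[k][j] + c[k][j]);
                ++r.sumEvaluations;
            }
    return r;
}

// Immediate evaluation: the sum is stored once in a temporary.
inline Result product_temporary_sum(const Mat& a, const Mat& b, const Mat& c) {
    const std::size_t n = a.size();
    assert(b.size() == n && c.size() == n);
    Result r{Mat(n, std::vector<double>(n, 0.0)), 0};
    Mat tmp(n, std::vector<double>(n, 0.0));
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t j = 0; j < n; ++j) {
            tmp[k][j] = b[k][j] + c[k][j];
            ++r.sumEvaluations;
        }
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                r.value[i][j] += a[i][k] * tmp[k][j];
    return r;
}

}  // namespace lazy_sketch
```

For n x n inputs the lazy variant computes n^3 sum coefficients and the temporary variant n^2, which is the kind of difference a cost model can use to prefer immediate evaluation.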

Eigen Implementation: lazy or immediate evaluation
● Assignment operator implementation (=)
template<typename Derived>
template<typename OtherDerived>
inline Derived& MatrixBase<Derived>
  ::operator=(const MatrixBase<OtherDerived>& other)
{
  return internal::assign_selector<Derived,OtherDerived>::run(derived(), other.derived());
}
● internal::assign_selector
template<typename Derived, typename OtherDerived,
         bool EvalBeforeAssigning = int(OtherDerived::Flags) & EvalBeforeAssigningBit,
         bool NeedToTranspose = Derived::IsVectorAtCompileTime
                && OtherDerived::IsVectorAtCompileTime
                && int(Derived::RowsAtCompileTime) == int(OtherDerived::ColsAtCompileTime)
                && int(Derived::ColsAtCompileTime) == int(OtherDerived::RowsAtCompileTime)
                && int(Derived::SizeAtCompileTime) != 1>
struct internal::assign_selector;
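The selector above is an instance of compile-time dispatch on a boolean template parameter. A stripped-down sketch of that pattern follows; the namespace `selector_sketch`, the toy expression types and the `Path` enum are hypothetical and stand in for the flag test on `EvalBeforeAssigningBit`, which is not reproduced here.

```cpp
namespace selector_sketch {  // hypothetical, heavily simplified

enum class Path { Direct, ViaTemporary };

// Toy expression types carrying a compile-time flag.
struct SumExpr     { static constexpr bool EvalBeforeAssigning = false; };
struct ProductExpr { static constexpr bool EvalBeforeAssigning = true; };

// Primary template: the second argument defaults to the expression's flag.
template <typename Expr, bool EvalFirst = Expr::EvalBeforeAssigning>
struct assign_selector;

// Flag not set: assign coefficient by coefficient, no temporary.
template <typename Expr>
struct assign_selector<Expr, false> {
    static constexpr Path run(const Expr&) { return Path::Direct; }
};

// Flag set: evaluate the expression into a temporary first.
template <typename Expr>
struct assign_selector<Expr, true> {
    static constexpr Path run(const Expr&) { return Path::ViaTemporary; }
};

// Entry point, playing the role of operator= in this sketch.
template <typename Expr>
constexpr Path assign(const Expr& e) {
    return assign_selector<Expr>::run(e);
}

}  // namespace selector_sketch
```

Because the choice is made through template specialization, the branch is resolved by the compiler and leaves no run-time test in the generated code.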
Eigen Implementation: Automatic vectorization
● Eigen vectorizes explicitly on its own rather than depending on the compiler's auto-vectorizer.
● Different vectorization code for different architectures
● SIMD instruction sets: SSE2, AltiVec, ARM NEON

Eigen Implementation: Automatic vectorization
● SSE and NEON work with 16-byte packets.
● 4 floats or ints, or 2 doubles, per packet.
● 4 float additions per packet operation
● With our vector size of 50, 48 coefficients are added packet by packet and the remaining 2 one at a time:
for(int i = 0; i < 4*(size/4); i+=4) u.packet(i) = v.packet(i) + w.packet(i);
for(int i = 4*(size/4); i < size; i++) u[i] = v[i] + w[i];
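The split into a packet loop and a scalar tail can be tried out without any SIMD intrinsics. In the sketch below a "packet" is simply a group of 4 consecutive floats processed in an inner loop; `packet_sketch`, `kPacketSize` and `add_by_packets` are hypothetical names, not part of Eigen.

```cpp
#include <cassert>

namespace packet_sketch {  // hypothetical helper, plain C++ standing in for SIMD

const int kPacketSize = 4;  // 4 floats per 16-byte packet, as on the slide

// Adds v and w into u and returns how many whole packets were processed.
inline int add_by_packets(const float* v, const float* w, float* u, int size) {
    assert(size >= 0);
    const int vectorizedEnd = kPacketSize * (size / kPacketSize);
    int packets = 0;
    // Main loop: one iteration stands for one packet addition (4 coefficients).
    for (int i = 0; i < vectorizedEnd; i += kPacketSize) {
        for (int lane = 0; lane < kPacketSize; ++lane)
            u[i + lane] = v[i + lane] + w[i + lane];
        ++packets;
    }
    // Tail loop: leftover coefficients are added one at a time.
    for (int i = vectorizedEnd; i < size; ++i)
        u[i] = v[i] + w[i];
    return packets;
}

}  // namespace packet_sketch
```

For size 50 this performs 12 packet iterations covering indices 0 to 47, and the tail loop handles indices 48 and 49.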
Eigen Implementation: which vectorization to use
● Implemented in a helper class, internal::assign_traits
enum {
  StorageOrdersAgree = (int(Derived::IsRowMajor) == int(OtherDerived::IsRowMajor)),
  MightVectorize = StorageOrdersAgree && (int(Derived::Flags) & int(OtherDerived::Flags) & ActualPacketAccessBit),
  MayInnerVectorize = MightVectorize && int(InnerSize)!=Dynamic && int(InnerSize)%int(PacketSize)==0
                      && int(DstIsAligned) && int(SrcIsAligned),
  MayLinearize = StorageOrdersAgree && (int(Derived::Flags) & int(OtherDerived::Flags) & LinearAccessBit),
  MayLinearVectorize = MightVectorize && MayLinearize && DstHasDirectAccess && (DstIsAligned || MaxSizeAtCompileTime==Dynamic),
  MaySliceVectorize = MightVectorize && DstHasDirectAccess && (int(InnerMaxSize)==Dynamic || int(InnerMaxSize)>=3*PacketSize)
};
Eigen Implementation: Linear Vectorization implementation
● The first few coefficients may have to be copied one by one, so that the remaining ones can be grouped into aligned packets of 4.
● First, determine the architecture-specific packet size
const int packetSize = internal::packet_traits<typename Derived1::Scalar>::size;
● Index of the first aligned coefficient
const int alignedStart = internal::assign_traits<Derived1,Derived2>::DstIsAligned ? 0
                       : internal::first_aligned(&dst.coeffRef(0), size);
● Copy the unaligned leading coefficients one at a time
for(int index = 0; index < alignedStart; index++)
  dst.copyCoeff(index, src);
Eigen Implementation: Linear Vectorization implementation
● The vector size 50 is not a multiple of the packet size (4 floats); 48 is the largest number of coefficients that whole packets can cover.
const int alignedEnd = alignedStart + ((size-alignedStart)/packetSize)*packetSize;
● Vectorized part
for(int index = alignedStart; index < alignedEnd; index += packetSize)
{
  dst.template copyPacket<Derived2, Aligned, internal::assign_traits<Derived1,Derived2>::SrcAlignment>(index, src);
}
● Last two coefficients
for(int index = alignedEnd; index < size; index++)
  dst.copyCoeff(index, src);
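The index arithmetic for the three phases (unaligned head, packets, scalar tail) can be reproduced in isolation. The sketch below assumes 16-byte packets of 4 floats and a destination pointer that is at least float-aligned; `align_sketch`, `first_aligned` and `split_range` are hypothetical stand-ins and make no claim about how `internal::first_aligned` is written.

```cpp
#include <cassert>
#include <cstdint>

namespace align_sketch {  // hypothetical helpers for illustration

const int kPacketSize = 4;  // floats per 16-byte packet (assumption)

struct Split {
    int alignedStart;  // first index whose address is 16-byte aligned
    int alignedEnd;    // one past the last index covered by whole packets
};

// Number of leading floats to skip until the address is 16-byte aligned,
// capped at size. Assumes p is aligned to sizeof(float).
inline int first_aligned(const float* p, int size) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    const std::uintptr_t elemIndex = addr / sizeof(float);
    const int skip = static_cast<int>(
        (kPacketSize - elemIndex % kPacketSize) % kPacketSize);
    return skip < size ? skip : size;
}

// Same arithmetic as on the slides: head [0, alignedStart),
// packets [alignedStart, alignedEnd), tail [alignedEnd, size).
inline Split split_range(const float* dst, int size) {
    assert(size >= 0);
    const int start = first_aligned(dst, size);
    const int end = start + ((size - start) / kPacketSize) * kPacketSize;
    return Split{start, end};
}

}  // namespace align_sketch
```

For an aligned buffer of 50 floats this gives alignedStart 0 and alignedEnd 48. If the buffer starts one float past an aligned address, it gives alignedStart 3 and alignedEnd 47, so three coefficients are copied before the packet loop and three after it.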
Aliasing Problem
● Occurs when the same matrix appears on both sides of an assignment, so the operation reads from the matrix it is writing to.
mat = mat.transpose();
● Produces wrong results.
● One solution is to use a temporary variable
tmp = mat.transpose();
mat = tmp;
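The effect is easy to reproduce outside Eigen. The sketch below uses a row-major n x n array and hypothetical helper names (`alias_sketch`, `transpose_into`, `transpose_in_place`); it shows why writing into the buffer being read goes wrong and how a temporary fixes it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace alias_sketch {  // hypothetical helpers for illustration

// Writes the transpose of the n x n row-major matrix src into dst.
// Only correct when src and dst are different buffers.
inline void transpose_into(const float* src, float* dst, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            dst[j * n + i] = src[i * n + j];
}

// Safe in-place version: copies the source into a temporary first.
inline void transpose_in_place(std::vector<float>& m, int n) {
    assert(m.size() == static_cast<std::size_t>(n) * n);
    const std::vector<float> tmp(m);
    transpose_into(tmp.data(), m.data(), n);
}

}  // namespace alias_sketch
```

Calling `transpose_into` with the same buffer as source and destination on the 2 x 2 matrix {1, 2, 3, 4} leaves {1, 2, 2, 4}, because the 3 is overwritten before it is read; `transpose_in_place` yields {1, 3, 2, 4}. Eigen also offers `transposeInPlace()` for this particular operation.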
Platforms
● Supported compilers:
– GCC (from 3.4 to 4.6), MSVC (2005, 2008, 2010), Intel ICC, Clang/LLVM
● Supported systems:
– x86/x86_64 (Linux, Windows)
– ARM (Linux), PowerPC
● Supported SIMD vectorization engines:
– SSE2, SSE3, SSSE3, SSE4
– NEON (ARM)
– AltiVec (PowerPC)
Eigen vs BLAS/Lapack
● Fixed-size matrices and vectors
● Sparse matrices and vectors
● More features, such as the Geometry and Array modules
● Most operations are faster than or comparable to MKL and GotoBLAS
● Better API
● Complex operations are faster
Benchmark
(benchmark plots not preserved in this copy)
Conclusion
● The benchmarks show that Eigen is comparable to most available linear algebra libraries.
● A simple interface makes it more attractive
● Low memory overhead
● Having all features and modules in a single library makes it more usable.
