Leverage scikit-learn’s composability to define pipelines as models
| | unique_id | ds | y |
|---|---|---|---|
| 0 | id_0 | 2000-01-01 | 0.428973 |
| 1 | id_0 | 2000-01-02 | 1.423626 |
| 2 | id_0 | 2000-01-03 | 2.311782 |
| 3 | id_0 | 2000-01-04 | 3.192191 |
| 4 | id_0 | 2000-01-05 | 4.148767 |
| | lag1 | dayofweek |
|---|---|---|
| 1 | 0.428973 | 6 |
| 2 | 1.423626 | 0 |
| 3 | 2.311782 | 1 |
| 4 | 3.192191 | 2 |
| 5 | 4.148767 | 3 |
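To make the relationship between the two tables concrete, here is a minimal pandas-only sketch that rebuilds the `lag1` and `dayofweek` features from the five series rows shown earlier. It is an illustration rather than the library's own preprocessing, and the variable names `series` and `features` are assumptions.

```python
import pandas as pd

# The five rows of the series shown in the first table.
series = pd.DataFrame({
    "unique_id": ["id_0"] * 5,
    "ds": pd.date_range("2000-01-01", periods=5, freq="D"),
    "y": [0.428973, 1.423626, 2.311782, 3.192191, 4.148767],
})

features = pd.DataFrame({
    # Previous value of y within each series.
    "lag1": series.groupby("unique_id")["y"].shift(1),
    # Monday=0 ... Sunday=6, following the pandas convention.
    "dayofweek": series["ds"].dt.dayofweek,
}).dropna()  # the first row of each series has no lag and is dropped

print(features)
```

The first row disappears because it has no previous value, which is why the feature table starts at index 1.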
We would like to take the dayofweek column and one-hot encode it, leaving the lag1 column untouched. We can achieve that with the following:
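A minimal sketch of such a pipeline, using only scikit-learn: a `ColumnTransformer` one-hot encodes `dayofweek` and passes `lag1` through, and its output feeds a `LinearRegression`. The toy `X` and `y` are taken from the tables above; the choice of `drop="first"`, the step name `"encoder"`, and the variable names are assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy features and targets copied from the tables above.
X = pd.DataFrame({
    "lag1": [0.428973, 1.423626, 2.311782, 3.192191],
    "dayofweek": [6, 0, 1, 2],
})
y = [1.423626, 2.311782, 3.192191, 4.148767]

# One-hot encode dayofweek only; every other column (lag1) passes through.
# drop="first" removes one indicator column (an assumption, used here to
# avoid perfectly collinear indicators in a linear model).
ohe = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(drop="first"), ["dayofweek"])],
    remainder="passthrough",
)

# The whole pipeline behaves like a single scikit-learn estimator.
model = make_pipeline(ohe, LinearRegression())
model.fit(X, y)

encoded = model[0].transform(X)  # 3 indicator columns + lag1
preds = model.predict(X)
```

Because the pipeline exposes the usual `fit` and `predict` methods, it can be used wherever a single scikit-learn regressor is expected. In mlforecast it would presumably be passed through the `models` argument of `MLForecast`, for example under the name `ohe_lr`, together with `lags=[1]` and `date_features=['dayofweek']`.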
| | unique_id | ds | ohe_lr |
|---|---|---|---|
| 0 | id_0 | 2000-08-10 | 4.312748 |
| 1 | id_1 | 2000-04-07 | 4.537019 |
| 2 | id_2 | 2000-06-16 | 4.160505 |
| 3 | id_3 | 2000-08-30 | 3.777040 |
| 4 | id_4 | 2001-01-08 | 2.676933 |