• 12th – 14th May
    As I am going to be using TIM (Tangent Information Modeller) to do the forecasting, it seemed a good place to start, so I have tried to digest as much information as I could. It is very different from using something like scikit-learn or TensorFlow, as there are no user forums or communities that…
  • 17th – 19th May
    Having played with TIM Studio and sample data, there is a fair amount of research that I need to do – the language is very different from what we have learnt on the course – e.g. backtesting, which I think is the same as using a training set and a test set (see the sketch below). I am unsure at…
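    If my understanding is right, backtesting is essentially a rolling train/test split over time rather than a single random split. A minimal sketch of that idea (the function and column handling are my own, not TIM's):

```python
import pandas as pd

def rolling_backtest(df: pd.DataFrame, horizon: int = 24, folds: int = 3):
    """Rolling-origin backtest: train on everything before a cut-off,
    test on the next `horizon` rows, then roll the cut-off forward."""
    splits = []
    n = len(df)
    for i in range(folds, 0, -1):
        cut = n - i * horizon
        train, test = df.iloc[:cut], df.iloc[cut:cut + horizon]
        splits.append((train, test))  # model fit/predict would go here
    return splits
```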
  • 21st – 22nd May
    As visualisation is a big part of the project, I have looked at Tableau, Power BI and the Plotly/Dash libraries for Python. Both Tableau and Power BI have licensing costs, whereas Plotly is free. Plotly allows for interactive dashboards in the same style as Power BI but gives greater flexibility in the final presentation and customisation…
  • 23rd – 25th May
    Having developed a workable initial dashboard for testing purposes using Plotly, I have now explored the weather data from Keele using DEOPS. The data is, as expected, in different formats depending on whether you connect to the API or do a manual download. For the speed of developing a test pipeline, I have started with…
  • 26th – 30th May
    Most of this time has been spent understanding the TIM documentation. While the documentation for TIM is quite extensive, the Python API part of it is sparse, as is the documentation about the configuration settings (see below). The data needs to be formatted correctly for TIM to deal with it. This is easily done using…
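    From my reading of the docs, the expected shape is a regular time series with the timestamp first, then the target, then any predictor columns. A rough pandas sketch of that reshaping (the file and column names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical raw export; the real column names differ.
raw = pd.read_csv("keele_weather.csv", parse_dates=["time"])

# Timestamp first, then target, then predictors, sorted and
# de-duplicated so the sampling is regular.
dataset = (
    raw.rename(columns={"time": "timestamp", "pv_output": "target"})
       .sort_values("timestamp")
       .drop_duplicates(subset="timestamp")
       .loc[:, ["timestamp", "target", "irradiance", "temperature"]]
)
```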
  • 3rd – 5th June
    So I think I have worked out a way to solve the imputation problem. I have written an iterative function that searches for blocks of missing data and uses the previous complete data, passed to the TIM API, to predict what the missing values should be. It isn’t ideal, but it is the…
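    In spirit, the function walks the series, finds each contiguous block of NaNs, and fills it from a forecast built on the complete data before the gap. A simplified sketch, with the TIM call stubbed out as a hypothetical forecast function:

```python
import pandas as pd

def impute_gaps(series: pd.Series, forecast) -> pd.Series:
    """Fill each contiguous NaN block using a forecast built from the
    data preceding the gap. `forecast(history, n)` stands in for the
    actual call to the TIM API and must return n values."""
    s = series.copy()
    gaps = s.isna().to_numpy()
    i = 0
    while i < len(s):
        if gaps[i]:
            j = i
            while j < len(s) and gaps[j]:
                j += 1                      # find the end of the gap
            history = s.iloc[:i].dropna()   # complete data before it
            if len(history) > 0:
                s.iloc[i:j] = forecast(history, j - i)
            i = j
        else:
            i += 1
    return s
```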
  • 6th – 8th June
    I’ve spent time searching for weather APIs. Having done some research, I needed one that was specific to solar panel production. While typical weather services offer things like chance of rain, snow or temperature, these are not the most important factors when it comes to solar panel production. I narrowed it down to Solcast and ClimaCell,…
  • 9th – 10th June
    Having now got a rough sample of data that can be passed to TIM, I have written a prototype dashboard with interaction. As I had used the Plotly and Dash libraries before, I knew that I would use these. They really are on a par with tools like Power BI and Tableau for visualisation, and pandas exceeds…
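    The skeleton of the prototype is nothing more exotic than this (placeholder data and column names, Dash 2.x import style):

```python
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html

# Placeholder data; the real app reads the forecast output instead.
df = pd.DataFrame({
    "timestamp": pd.date_range("2021-06-01", periods=48, freq="H"),
    "kwh": range(48),
})

app = Dash(__name__)
app.layout = html.Div([
    html.H2("Solar production forecast"),
    dcc.Graph(figure=px.line(df, x="timestamp", y="kwh")),
])

if __name__ == "__main__":
    app.run_server(debug=True)
```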
  • 20th June
    Due to personal problems, I have had to take a break. It seemed like a good time to go over my pipeline and clean it up. I also realised I hadn’t commented my code (or planned it properly), so there was a fair bit of refactoring needed, but I do now have a cleaner…
  • 21st – 26th June
    Mike from ST-Energy 360 really wants the model-building and prediction stages split into two separate parts. This doesn’t make sense to me, as one of the selling points of TIM is real-time, instant ML. However, I have to accept that he is the customer, and this is an internship, not just a placement. This…
  • 27th June
    I had a meeting with Mike today to discuss the splitting of the model. He is worried about the compute costs of rebuilding the model every time. I showed him the demos that I had worked on, and we realised that the predict part alone, using an existing model, was using just as much compute…
  • 28th June – 2nd July
    I have spent time trying to improve the dashboard. Mike wants accuracy metrics, which TIM provides, and favours MAPE as the easiest for a layperson to understand. I have tried to explain that MAPE (Mean Absolute Percentage Error) will not work in this scenario, as the target variable (solar panel production) will very…
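    The problem is in the definition: MAPE divides by the actual value, so it is undefined whenever an actual is zero – and solar production is zero every night. A tiny demonstration:

```python
import numpy as np

actual = np.array([0.0, 1.2, 3.5, 4.1])      # overnight reading is zero
predicted = np.array([0.1, 1.0, 3.0, 4.0])

# MAPE = mean(|actual - predicted| / |actual|); the zero actual
# turns the whole metric into infinity.
mape = np.mean(np.abs(actual - predicted) / np.abs(actual)) * 100
print(mape)  # inf
```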
  • 5th – 7th July
    I have explored using a database instead of CSV files for storing and retrieving data. I have done some experimenting and concluded that the final application will probably be deployed in two parts (backend and frontend) on a cloud provider, probably Azure for development, as I already have a student account with them. Given…
  • 19th – 22nd July
    After another break due to personal problems, I have caught up with Mike again. He seems reasonably happy with the dashboard currently but is concerned about the accuracy. This matches my own concern, so I am glad to be working on the data side of things again. One of the things that I have been wanting…
  • 23rd – 27th July
    These past few days have involved a few small jobs that needed doing, including displaying data in a tabular format as well as a graph, ensuring we always have a full year’s worth of data to build a model on, and making sure the forecast is always seven days from the time of the last…
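    The windowing logic behind those last two jobs is simple enough to sketch (the column name and helper are my own, for illustration):

```python
import pandas as pd

def trim_and_horizon(df: pd.DataFrame, ts_col: str = "timestamp"):
    """Keep a rolling year of history and work out the 7-day forecast
    horizon from the last available timestamp."""
    last = df[ts_col].max()
    history = df[df[ts_col] >= last - pd.Timedelta(days=365)]
    horizon_end = last + pd.Timedelta(days=7)
    return history, horizon_end
```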
  • 29th July – 6th August
    In my supervisor meeting, we discussed deployment, and it was suggested that the best way forward was to use Docker containers. I had heard of Docker but had absolutely no experience working with it, so I have spent the last few days watching tutorials and setting up and deploying simple ‘Hello World’ type containers to Azure. Finally,…
  • 7th – 13th August
    On my own computer, I have now split the app into two parts and have them talking to each other by accessing the same files. However, I need to work out the best way to do this on Azure with containers. I have experimented with FastAPI, which would have an endpoint on the backend that…
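    The FastAPI experiment was along these lines (the endpoint name and file are my own placeholders, not a settled design); it runs under uvicorn:

```python
from fastapi import FastAPI
import pandas as pd

app = FastAPI()

# Hypothetical endpoint: the frontend fetches the latest forecast
# from the backend instead of the two sharing files.
@app.get("/forecast")
def get_forecast():
    df = pd.read_csv("latest_forecast.csv")
    return df.to_dict(orient="records")
```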
  • 16th – 18th August
    Having not found a satisfactory way to use Azure File Storage, and given the costs involved in using an API endpoint, I finally found the best solution for this project at its current scale, and that is Azure Blob Storage. This allows much easier access from a Python script to both read and write data to…
  • 23rd – 25th August
    Having got a plan now of what the architecture will look like, I need to implement it and ensure that all of the parts work; that is, the front end runs continuously and reads data from the Blob Storage, and the backend spins up at around 4 am every day to update the data and…
  • 13th – 16th September
    I wanted to start with the backend, ensuring first that it would run as a Docker container. I created the Dockerfile and ran the build. It probably took the best part of a day to get this deployed. I had errors because of certain base images that I chose, finally settling on a basic Python…
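    A minimal Dockerfile of the shape that ended up working (the tag, file names and entry point here are illustrative, not the exact ones):

```dockerfile
# Plain Python base image; more exotic bases were what caused the errors.
FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python", "backend.py"]
```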
  • 18th – 23rd September
    Having uploaded the files to the blob storage, I checked that the files could be read from there, which was as easy as using the endpoint of the file and reading it into a pandas dataframe. However, it wasn’t as straightforward to write a file. Eventually, I discovered the correct way to do it, which…
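    One route that works with the azure-storage-blob (v12) SDK – not necessarily the exact one described above – looks like this (the account, container and connection string are placeholders):

```python
import pandas as pd
from azure.storage.blob import BlobServiceClient

# Reading a publicly readable blob is just its endpoint URL:
df = pd.read_csv("https://<account>.blob.core.windows.net/data/forecast.csv")

# Writing goes through the SDK; a connection string is one option
# (a SAS token or managed identity would also work).
service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client(container="data", blob="forecast.csv")
blob.upload_blob(df.to_csv(index=False), overwrite=True)
```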
  • 24th – 30th September
    Having got the backend running regularly, I have now focused on the front end. With agreement from Mike, I have removed the metrics and put on a graph that shows the last seven days’ actuals vs predictions, as well as an interactive up-to-seven-day forecast, and the tabular data. This has been deployed and…
  • 5th October
    In the end, it wasn’t a big accuracy problem but an error in the data pipeline that didn’t account for the granularity difference between DEOPS Systems and the DEOPS API. This was a simple fix, and the new system now shows a much better correlation between actuals and predictions. There is only one thing left to do…
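    The general pattern for a fix like this is a resample so both feeds share one granularity before they are joined (the intervals here are assumed for illustration, not the real DEOPS settings):

```python
import pandas as pd

# Hypothetical 15-minute feed standing in for the finer source.
idx = pd.date_range("2021-10-01", periods=8, freq="15T")
fine = pd.DataFrame({"kwh": range(8)}, index=idx)

# Aggregate to hourly so it lines up with the coarser source
# before the two are compared or joined.
hourly = fine.resample("1H").sum()
print(hourly)
```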