Opening Ceremony
[09:50 – 10:00] Opening the Workshop
Anne-Cecilie Riiser (Norwegian Meteorological Institute)
Meteorological Research
[10:00 – 10:30] XNow: a convolutional neural network model for weather nowcasting using radar data from the Romanian National Meteorological Administration
Andrei Mihai (Babeș-Bolyai University)
Radar data is heavily used for weather prediction and is crucial for issuing relevant warnings of severe weather. This presentation focuses on XNow, the model we developed to predict radar data for the Transylvania region, using data provided by the Romanian National Meteorological Administration. We evaluated several types of neural network models for this purpose, based on different architectures, and the best results were obtained with a model based on the Xception architecture. The presentation will cover the XNow architecture and how it differs from the Xception architecture, the training methodology and data model used for radar data prediction, and the modifications implemented for making predictions over increasingly longer timeframes.
[10:30 – 11:00] A convolutional neural network approach for weather forecasting using radar data
Alexandra Albu (Babeș-Bolyai University)
Weather forecasting is a challenging task because it requires fast and accurate predictions of severe weather phenomena on high-resolution maps. Current deep learning approaches, however, struggle to predict extreme weather events because little training data characterizing such phenomena is available. A further limitation is that the models tend to produce blurry predictions, especially for time steps far in the future. We present a new convolutional deep learning architecture for weather forecasting based on the ResNeXt model. The proposed approach is evaluated on radar data from Romania and Norway and compared with approaches from related work. We also analyse how the length of the past time window used as input affects the results. In addition, we discuss directions for improving the visual quality of the predicted radar images and for enhancing the performance of the model on severe events.
[11:00 – 11:30] Enhancing the performance of quantitative precipitation estimation using ensemble of machine learning models applied on weather radar data
Eugen Mihuleț (Romanian National Meteorological Administration)
We present the use of machine learning techniques to improve radar-based rainfall estimates for early warning systems. Using reflectivity data from a WSR-98D weather radar of the Romanian National Meteorological Administration, we present a proof-of-concept study that evaluates six machine learning models for estimating hourly accumulated rainfall. Our findings show that a stacked machine learning model outperforms both the radar-estimated accumulated rainfall and the baseline computed using the Z-R relationship.
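For readers unfamiliar with the Z-R baseline mentioned in this abstract, the following is a minimal sketch of how a rain rate can be derived from radar reflectivity with a power law Z = a * R^b. The coefficients a = 200 and b = 1.6 are the classic Marshall-Palmer values and are an assumption here; the abstract does not state which coefficients, scan interval, or accumulation scheme the study used, and the function names are hypothetical.

```python
import math

def rain_rate_from_dbz(dbz: float, a: float = 200.0, b: float = 1.6) -> float:
    """Invert Z = a * R**b to get the rain rate R in mm/h from reflectivity in dBZ.

    The default a and b are the Marshall-Palmer coefficients (an assumption,
    not necessarily the values used in the study).
    """
    z_linear = 10.0 ** (dbz / 10.0)  # convert dBZ to the linear reflectivity factor Z
    return (z_linear / a) ** (1.0 / b)

def accumulated_rainfall(dbz_scans: list[float], scan_interval_min: float) -> float:
    """Accumulate rainfall in mm by treating each scan's rain rate as constant over the interval."""
    hours_per_scan = scan_interval_min / 60.0
    return sum(rain_rate_from_dbz(dbz) * hours_per_scan for dbz in dbz_scans)

# Reflectivity at which the assumed power law yields exactly 1 mm/h.
dbz_one_mm_per_hour = 10.0 * math.log10(200.0)
```

A machine learning estimator such as the stacked model described above would replace this fixed power law with a mapping learned from data.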
[11:30 – 12:00] The WeaMyL Annotated Atlas of Meteorological Observations
Abdelkader Mezghani (Norwegian Meteorological Institute)
We present the Annotated Atlas of Meteorological Observations, which was developed within the WeaMyL research project. It is a web-based application for analysing weather warning (CAP) proposals: users review historical weather warnings and earlier situations and grade the hit rate of the earlier warnings. The Annotated Atlas platform also identifies underlying meteorological events from various data sources such as radar images, satellite images and ground-based observations. This requires that all datasets have standard metadata that can be easily found in national metadata catalog services. The selected events are finally used as a reference in training the machine learning algorithms for weather forecasting.
[12:00 – 12:30] The WeaMyL Forecasting Platform
Abdelkader Mezghani (Norwegian Meteorological Institute)
The weather forecasting platform developed within the WeaMyL project is presented from an IT and a scientific perspective. We will discuss the different model versions and evaluate and compare them. We will also demonstrate how to use the ML platform and how it is linked to the different WeaMyL and national weather services in Norway and Romania. The evaluation of the different models is based on rainfall warning events extracted from the Annotated Atlas of Meteorological Observations. Forecasts for those dates are regenerated and the results are compared to radar observations. We use the composite reflectivity as a proxy to estimate rainfall.
Applications of Deep Learning
[13:30 – 14:00] Comparing one- and binary-class SVM-based software defect predictors
George Ciubotariu (Babeș-Bolyai University)
Software Defect Prediction (SDP) is a task of growing interest as the software industry expands. Its main difficulty lies in class imbalance: most open-source software projects annotated through bug tracking contain few defects. This rarity of bugs often causes Machine Learning models to underperform dramatically, even when diverse data augmentation or selection methods are applied. As a result, our focus shifts towards unsupervised learning, namely One-Class Classification, a family of outlier detection algorithms designed to be trained using only instances of a single label. Following this approach, we adapt the traditional Support Vector Machine model to perform outlier detection on an Apache Calcite dataset. We name the resulting One-Class model OCSVM and append a “+” or a “-” to the acronym to specify whether it was trained on positive or negative instances, respectively. Using a streamlined framework we developed, we perform a grid search to find more suitable models. The main findings reveal several trends in the behaviour of SVM-based models on SDP problems.
[14:00 – 14:30] Video completion conditioned by natural language-based descriptions
Orășan Paul (Babeș-Bolyai University)
Recent breakthroughs in the study of Denoising Diffusion Models have been a driving factor in advancing the field of Computer Vision with the aid of generative Machine Learning. Challenging problems, such as image or video synthesis, are now combined with Natural Language Processing techniques to push the boundaries of computational capabilities and foster human ingenuity. We present an approach to text-guided video completion, the difficult task of producing the future temporal progression of an image, conditioned on a textual description. Our proposed framework applies the latest advances in Text-to-Image and Text-to-Video generation to perform Image Animation guided by text, a topic that remains insufficiently explored because of its difficulty. We show how adapting generalised pre-trained Text-to-Image models can achieve good performance on this problem in a Zero-Shot context.
[14:30 – 15:00] Enhancing the performance of software authorship attribution using deep autoencoders
Briciu Anamaria (Babeș-Bolyai University)
Efficient authorship attribution systems can bring improvements to software engineering processes such as software maintenance or software quality assurance, as well as plagiarism detection, with particular relevance in copyright and licensing issues. While there is extensive research on software authorship attribution, relatively few works investigate the relevance of natural language processing-inspired representations for source code. This presentation focuses on two autoencoder-based approaches to software authorship attribution. The first, AutoSoft, is a supervised multi-class classification model composed of an ensemble of autoencoders that are trained to encode and recognize the programming style of individual software developers. The second, SoftId, is an autoencoder-based one-class classification model that detects whether a given piece of source code was written by one of the developers in the set on which the model was trained or by an “unknown” software developer. Both models use natural language processing-inspired representations of source code. The experimental results provide empirical evidence that the proposed representations capture relevant information about how developers write their code, and that the autoencoders’ ability to encode meaningful data patterns makes them a suitable technique for the authorship attribution task.
[15:00 – 15:30] Automatic code generation for malware detection based on MITRE ATT&CK techniques
Sîrbu Alexandru-Gabriel (Babeș-Bolyai University)
Since the rise of language models that can solve a variety of problems, such as ChatGPT, developers have used them to write code more efficiently. However, the generated code may contain errors, such as syntax errors, that sometimes render it unusable. We therefore propose a syntax-error-free generator model that creates an Abstract Syntax Tree from a natural language description; the tree is then translated into code. Such a model has many potential applications, but our focus is on building a tool that helps developers create faster and better means of protection against malware, based on MITRE ATT&CK technique descriptions. These open-source descriptions detail generic malware behaviour and have become a cybersecurity industry standard for detections.
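To illustrate why generating an Abstract Syntax Tree first avoids syntax errors, here is a minimal sketch using Python's standard `ast` module. It is not the model described in the talk: the tree is built by hand rather than predicted from a description, and the `event` dictionary, its `process` key, and the function name `build_process_check` are hypothetical. The point is only that source text produced from a well-formed tree is syntactically valid by construction.

```python
import ast

def build_process_check(process_name: str) -> ast.Expression:
    """Build the AST for the expression: event.get('process') == <process_name>."""
    call = ast.Call(
        func=ast.Attribute(
            value=ast.Name(id="event", ctx=ast.Load()),
            attr="get",
            ctx=ast.Load(),
        ),
        args=[ast.Constant(value="process")],
        keywords=[],
    )
    compare = ast.Compare(
        left=call,
        ops=[ast.Eq()],
        comparators=[ast.Constant(value=process_name)],
    )
    # Fill in line and column information so the tree can be compiled.
    return ast.fix_missing_locations(ast.Expression(body=compare))

tree = build_process_check("powershell.exe")
source = ast.unparse(tree)                    # translate the tree into source text
code = compile(tree, "<generated>", "eval")   # the tree compiles directly, no parsing step needed
matched = eval(code, {"event": {"process": "powershell.exe"}})
```

A generator that emits tree nodes instead of raw characters can still produce logically wrong detections, but it cannot produce text that fails to parse.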
[15:30 – 16:00] Finding Cooperation in the N-Player Iterated Prisoner’s Dilemma with Deep Reinforcement Learning Over Dynamic Complex Networks
Imre Mali (Babeș-Bolyai University)
Biological, social and economic systems exhibit enormous levels of complexity, and studying situations of cooperation or conflict within such systems is of particular interest. The N-Player Iterated Prisoner’s Dilemma (NIPD) is a general game-theoretic model that captures realistic scenarios of cooperation and conflict. We study the emergence of cooperation in the NIPD using methods borrowed from the field of reinforcement learning. Such methods have attracted immense attention over the last decade and have shown promising results in robotics, game-playing, multi-agent systems, and other areas. However, it is well known that plain reinforcement learning applied to the NIPD converges to the Nash equilibrium of the game, which is defection. Therefore, additional mechanisms are needed to foster the development and upkeep of cooperation. This work considers different interconnection topologies for players of the NIPD and uses dynamic rewiring (players can sever some connections and form others) as a mechanism to encourage cooperation. In such scenarios, the initial interconnection topology is also a crucial aspect.
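The dilemma and the rewiring idea described in this abstract can be sketched in a few lines. The payoff below is one common public-goods-style parameterisation of the NIPD, and the rewiring rule (drop one defecting neighbour, link to a random non-neighbour) is a hypothetical example; the abstract specifies neither the payoff function nor the rewiring rule used in the work, and the benefit and cost values are illustrative.

```python
import random

def payoff(cooperates: bool, n_other_cooperators: int, n_players: int,
           benefit: float = 3.0, cost: float = 1.0) -> float:
    """One-round payoff: every cooperator pays `cost`, and all players share the benefit equally."""
    n_cooperators = n_other_cooperators + (1 if cooperates else 0)
    shared = benefit * n_cooperators / n_players
    return shared - (cost if cooperates else 0.0)

def rewire(neighbours: dict[int, set[int]], actions: dict[int, bool],
           player: int, rng: random.Random) -> None:
    """Hypothetical rule: sever one link to a defecting neighbour and link to a random non-neighbour."""
    defectors = sorted(n for n in neighbours[player] if not actions[n])
    candidates = sorted(set(neighbours) - neighbours[player] - {player})
    if not defectors or not candidates:
        return  # nothing to sever, or nobody new to connect to
    old, new = rng.choice(defectors), rng.choice(candidates)
    neighbours[player].discard(old)
    neighbours[old].discard(player)
    neighbours[player].add(new)
    neighbours[new].add(player)

# With 5 players, defecting pays more whatever the others do,
# yet universal cooperation pays more than universal defection.
N = 5
defection_dominates = all(payoff(False, k, N) > payoff(True, k, N) for k in range(N))
cooperation_is_better = payoff(True, N - 1, N) > payoff(False, 0, N)

# Player 0 is linked to a defector (1) and a cooperator (2); player 3 is unconnected.
graph = {0: {1, 2}, 1: {0}, 2: {0}, 3: set()}
actions = {0: True, 1: False, 2: True, 3: True}  # True means the player cooperates
rewire(graph, actions, 0, random.Random(0))
```

With these illustrative values defection dominates because the individual share of one contribution (benefit / N) is smaller than its cost, which is why a learner without extra mechanisms drifts to defection; rewiring lets cooperators isolate defectors over time.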