
DeepPerf

Many software systems expose a set of configuration options, and different configurations can lead to different runtime performance. It is useful to know how a system will perform under a given configuration before the system is actually configured and deployed: this helps users make rational configuration decisions and reduces the cost of performance testing. Because the number of possible configurations can grow exponentially with the number of options, it is impractical to deploy and measure the system under every configuration. Recently, several learning methods have been proposed that build a performance prediction model from performance data collected on a small sample of configurations, and then use the model to predict the system's performance under a new configuration.

DeepPerf is an end-to-end deep-learning-based solution that trains a software performance prediction model from a limited number of samples and predicts the performance value of a software system under a new configuration. DeepPerf consists of two main stages:

  • Stage 1: Tune the hyperparameters of the neural network
  • Stage 2: Use the hyperparameters found in Stage 1 to train the neural network on the samples and predict the performance value of the software system under a new configuration.
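
The two-stage structure can be illustrated with a small, self-contained sketch. This is not DeepPerf's neural network: a k-nearest-neighbour predictor is swapped in as a stand-in model so the example runs without TensorFlow, and the hyperparameter being tuned is simply `k`. The function names, the 2/3 validation split, and the toy data are all assumptions made for illustration.

```python
import random

def knn_predict(train, config, k):
    """Stand-in model: mean performance of the k training configs closest in Hamming distance."""
    nearest = sorted(train, key=lambda item: sum(a != b for a, b in zip(item[0], config)))[:k]
    return sum(perf for _, perf in nearest) / len(nearest)

def validation_error(train, held_out, k):
    """Mean relative error (in percent) of the stand-in model on held-out samples."""
    errors = [abs(knn_predict(train, cfg, k) - perf) / perf for cfg, perf in held_out]
    return 100.0 * sum(errors) / len(errors)

def tune_then_train(samples, candidate_ks, seed=0):
    """Stage 1: pick the hyperparameter on a validation split.
    Stage 2: build the final predictor from all samples with that hyperparameter."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    split = max(1, (2 * len(shuffled)) // 3)  # assumed 2/3 train, 1/3 validation
    train, validation = shuffled[:split], shuffled[split:]
    best_k = min(candidate_ks, key=lambda k: validation_error(train, validation, k))

    def final_model(config):
        return knn_predict(samples, config, best_k)

    return best_k, final_model

# Toy data: (binary configuration, measured performance). Values are made up.
samples = [
    ((0, 0, 0), 10.0), ((1, 0, 0), 15.0), ((0, 1, 0), 12.0),
    ((0, 0, 1), 11.0), ((1, 1, 0), 17.0), ((1, 0, 1), 16.0),
]
best_k, model = tune_then_train(samples, candidate_ks=[1, 2, 3])
print(best_k, model((1, 1, 1)))
```

The point of the sketch is only the control flow: hyperparameters are chosen first on part of the sample, then fixed while the final model is fitted on the whole sample.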

Citing DeepPerf

If you find our code useful, please cite our paper:

@inproceedings{Ha2019DeepPerf,
  author    = {Huong Ha and
               Hongyu Zhang},
  title     = {DeepPerf: performance prediction for configurable software with deep
               sparse neural network},
  booktitle = {Proceedings of the 41st International Conference on Software Engineering,
               {ICSE} 2019, Montreal, QC, Canada, May 25-31, 2019},
  pages     = {1095--1106},
  publisher = {{IEEE} / {ACM}},
  year      = {2019}
}

Prerequisites

  • Python 3.6.x
  • TensorFlow (tested with TensorFlow 1.10.0 and 1.8.0)

Installation

DeepPerf can be run directly from source:

  1. Download and install Python 3.6.x.

  2. Install TensorFlow

    $ pip install tensorflow==1.10.0

  3. Clone DeepPerf

    $ git clone https://github.com/DeepPerf/DeepPerf.git

Data

DeepPerf has been evaluated on 11 real-world configurable software systems:

  • Apache
  • LLVM
  • x264
  • BDBC
  • BDBJ
  • SQL
  • Dune
  • hipacc
  • hsmgp
  • javagc
  • sac

Six of these systems have only binary configuration options; the other five have both binary and numeric configuration options. The data is stored in the DeepPerf\Data directory. These software systems were measured and published online by the SPLConqueror team, whose materials give more information about the systems and how they were measured.

Usage

To run DeepPerf, users need to specify the name of the software system they wish to evaluate and then run the script AutoDeepPerf.py. There are 11 software systems that users can evaluate: Apache, LLVM, x264, BDBC, BDBJ, SQL, Dune, hipacc, hsmgp, javagc, sac. The script will then evaluate DeepPerf on the chosen software system with the same experiment setup presented in our paper. Specifically, for binary software systems, DeepPerf will run with five different sample sizes: n, 2n, 3n, 4n, 5n with n being the number of options, and 30 experiments for each sample size. For binary-numeric software systems, DeepPerf will run with the sample sizes specified in Table IV of our paper, and 30 experiments for each sample size. For example, if users want to evaluate DeepPerf with the system LLVM, the command line to run DeepPerf will be:

$ python AutoDeepPerf.py LLVM

After finishing each sample size, the script writes a .csv file that reports the mean prediction error and the margin (95% confidence interval) for that sample size over the 30 experiments. These results should be the same as, or similar to, the results we report in Tables III and IV of our paper.
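
As a rough illustration of the default schedule for binary systems, the sketch below computes the five sample sizes n, 2n, 3n, 4n, 5n and draws one training sample per experiment. The option count of 10, the pool of 1000 measured configurations, and the use of uniform random sampling without replacement are assumptions for the example, not details taken from AutoDeepPerf.py.

```python
import random

def binary_sample_sizes(num_options, multipliers=(1, 2, 3, 4, 5)):
    """Sample sizes n, 2n, ..., 5n, where n is the number of configuration options."""
    return [m * num_options for m in multipliers]

def draw_training_indices(total_configs, sample_size, seed):
    """Pick which measured configurations form the training sample for one experiment.
    Uniform sampling without replacement is an assumption of this sketch."""
    rng = random.Random(seed)
    return rng.sample(range(total_configs), sample_size)

num_options = 10          # hypothetical option count
total_configs = 1000      # hypothetical number of measured configurations
num_experiments = 30      # as stated in the README

for size in binary_sample_sizes(num_options):
    # One independent sample per experiment, seeded by the experiment index.
    runs = [draw_training_indices(total_configs, size, seed) for seed in range(num_experiments)]
    print(size, len(runs))
```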

Alternatively, users can customize the sample size and/or the number of experiments for each sample size with the optional arguments -ss and -ne. For example, to set the sample size to 20 and the number of experiments to 10, the corresponding command line is:

$ python AutoDeepPerf.py LLVM -ss 20 -ne 10

If neither option is set, or only one is set, each unset option uses its default. The default number of experiments is 30. The default sample sizes are: (a) the five sample sizes n, 2n, 3n, 4n, 5n, with n being the number of configuration options, when the evaluated system is a binary system, or (b) the four sample sizes specified in Table IV of our paper when the evaluated system is a binary-numeric system.
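
The defaulting behaviour described above could be expressed with `argparse` roughly as follows. This is a hypothetical sketch, not the actual parsing code in AutoDeepPerf.py: only the flags `-ss` and `-ne` and the default of 30 experiments come from this README, while the attribute names, the helper `resolve_sample_sizes`, and the example default sizes are assumptions.

```python
import argparse

DEFAULT_NUM_EXPERIMENTS = 30  # default stated in this README

def build_parser():
    """Illustrative stand-in for the command line; the real script may differ."""
    parser = argparse.ArgumentParser(description="Sketch of the AutoDeepPerf.py arguments")
    parser.add_argument("system_name", help="subject system, e.g. LLVM")
    parser.add_argument("-ss", dest="sample_size", type=int, default=None,
                        help="single sample size; omit to use the default schedule")
    parser.add_argument("-ne", dest="num_experiments", type=int,
                        default=DEFAULT_NUM_EXPERIMENTS,
                        help="number of experiments per sample size")
    return parser

def resolve_sample_sizes(sample_size, default_sizes):
    """Use the user's sample size if given, otherwise fall back to the default schedule."""
    return [sample_size] if sample_size is not None else list(default_sizes)

# Equivalent to: python AutoDeepPerf.py LLVM -ss 20 -ne 10
args = build_parser().parse_args(["LLVM", "-ss", "20", "-ne", "10"])
# The default schedule below assumes a hypothetical binary system with n = 10 options.
print(args.system_name, resolve_sample_sizes(args.sample_size, [10, 20, 30, 40, 50]),
      args.num_experiments)
```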

NOTE: The time needed to tune the hyperparameters and train the final neural network for one experiment ranges from 2 to 20 minutes, depending on the software system, the sample size, and the user's CPU. Typically, the time cost is lower when the software system has fewer configuration options or when the sample size is small. Please be aware that, for each sample size, evaluating 30 experiments therefore takes between 1 hour and 10 hours.
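
The per-sample-size estimate follows directly from the per-experiment range. A minimal sketch of the arithmetic, assuming experiments run one after another and each takes the same time (the function name is made up for the example):

```python
def batch_time_hours(minutes_per_experiment, num_experiments=30):
    """Total wall-clock time in hours for one sample size, assuming sequential experiments."""
    return minutes_per_experiment * num_experiments / 60.0

low = batch_time_hours(2)    # fastest case: 2 minutes per experiment
high = batch_time_hours(20)  # slowest case: 20 minutes per experiment
print(low, high)  # prints: 1.0 10.0
```

Reducing the number of experiments with -ne scales the estimate proportionally; for example, `batch_time_hours(20, num_experiments=10)` gives a little over 3 hours in the slowest case.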

Experimental Results

To evaluate the prediction accuracy, we use the mean relative error (MRE), which is computed as

$$\mathrm{MRE} = \frac{1}{|V|} \sum_{c \in V} \frac{|\mathit{predicted}_c - \mathit{actual}_c|}{\mathit{actual}_c} \times 100$$

where V is the testing dataset, predicted_c is the performance value the model predicts for configuration c, and actual_c is the actual performance value of configuration c. In the two tables below, Mean is the mean of the MREs over the 30 experiments and Margin is the margin of the 95% confidence interval of the MREs over the 30 experiments. The results were obtained by evaluating DeepPerf on a Windows 7 computer with an Intel Xeon E5-1650 CPU at 3.2 GHz and 16 GB of RAM.
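
The Mean and Margin columns can be reproduced in outline as follows. This is a sketch under two assumptions: the MRE is expressed as a percentage, and the 95% margin uses the normal approximation 1.96 · s / √k over the k experiments. The exact margin computation used for the tables may differ (for example, a t-distribution quantile), and the function names and numbers here are illustrative only.

```python
import math
import statistics

def mean_relative_error(predicted, actual):
    """MRE in percent over a test set: mean of |predicted - actual| / actual, times 100."""
    ratios = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(ratios) / len(ratios)

def mean_and_margin(mre_values, z=1.96):
    """Mean of the per-experiment MREs and the half-width of an approximate 95% interval."""
    mean = statistics.mean(mre_values)
    margin = z * statistics.stdev(mre_values) / math.sqrt(len(mre_values))
    return mean, margin

# One experiment: two test configurations, each predicted 10% off.
print(mean_relative_error([110.0, 90.0], [100.0, 100.0]))

# Several experiments: summarise their MREs (made-up values).
print(mean_and_margin([5.2, 4.8, 5.5, 5.0, 4.9]))
```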

Prediction accuracy for software systems with binary options

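| Subject System | Sample Size | DECART Mean | DECART Margin | DeepPerf Mean | DeepPerf Margin |
|---|---|---|---|---|---|
| Apache | n | NA | NA | 17.87 | 1.85 |
| Apache | 2n | 15.83 | 2.89 | 10.24 | 1.15 |
| Apache | 3n | 11.03 | 1.46 | 8.25 | 0.75 |
| Apache | 4n | 9.49 | 1.00 | 6.97 | 0.39 |
| Apache | 5n | 7.84 | 0.28 | 6.29 | 0.44 |
| x264 | n | 17.71 | 3.87 | 10.43 | 2.28 |
| x264 | 2n | 9.31 | 1.30 | 3.61 | 0.54 |
| x264 | 3n | 6.37 | 0.83 | 2.13 | 0.31 |
| x264 | 4n | 4.26 | 0.47 | 1.49 | 0.38 |
| x264 | 5n | 2.94 | 0.52 | 0.87 | 0.11 |
| BDBJ | n | 10.04 | 4.67 | 7.25 | 4.21 |
| BDBJ | 2n | 2.23 | 0.16 | 2.07 | 0.32 |
| BDBJ | 3n | 2.03 | 0.16 | 1.73 | 0.12 |
| BDBJ | 4n | 1.72 | 0.09 | 1.67 | 0.12 |
| BDBJ | 5n | 1.67 | 0.09 | 1.61 | 0.09 |
| LLVM | n | 6.00 | 0.34 | 5.09 | 0.80 |
| LLVM | 2n | 4.66 | 0.47 | 3.87 | 0.48 |
| LLVM | 3n | 3.96 | 0.39 | 2.54 | 0.15 |
| LLVM | 4n | 3.54 | 0.42 | 2.27 | 0.16 |
| LLVM | 5n | 2.84 | 0.33 | 1.99 | 0.15 |
| BDBC | n | 151.0 | 90.70 | 133.6 | 54.33 |
| BDBC | 2n | 43.8 | 26.72 | 16.77 | 2.25 |
| BDBC | 3n | 31.9 | 22.73 | 13.1 | 3.39 |
| BDBC | 4n | 6.93 | 1.39 | 6.95 | 1.11 |
| BDBC | 5n | 5.02 | 1.69 | 5.82 | 1.33 |
| SQL | n | 4.87 | 0.22 | 5.04 | 0.32 |
| SQL | 2n | 4.67 | 0.17 | 4.63 | 0.13 |
| SQL | 3n | 4.36 | 0.09 | 4.48 | 0.08 |
| SQL | 4n | 4.21 | 0.1 | 4.40 | 0.14 |
| SQL | 5n | 4.11 | 0.08 | 4.27 | 0.13 |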

Prediction accuracy for software systems with binary-numeric options

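| Subject System | Sample Size | SPLConqueror Sampling Heuristic | SPLConqueror Mean | DeepPerf Sampling Heuristic | DeepPerf Mean | DeepPerf Margin |
|---|---|---|---|---|---|---|
| Dune | 49 | OW RD | 20.1 | RD | 15.73 | 0.90 |
| Dune | 78 | PW RD | 22.1 | RD | 13.67 | 0.82 |
| Dune | 240 | OW PBD(49, 7) | 10.6 | RD | 8.19 | 0.34 |
| Dune | 375 | OW PBD(125, 5) | 18.8 | RD | 7.20 | 0.17 |
| hipacc | 261 | OW RD | 14.2 | RD | 9.39 | 0.37 |
| hipacc | 528 | OW PBD(125, 5) | 13.8 | RD | 6.38 | 0.44 |
| hipacc | 736 | OW PBD(49, 7) | 13.9 | RD | 5.06 | 0.35 |
| hipacc | 1281 | PW RD | 13.9 | RD | 3.75 | 0.26 |
| hsmgp | 77 | OW RD | 4.5 | RD | 6.76 | 0.87 |
| hsmgp | 173 | PW RD | 2.8 | RD | 3.60 | 0.2 |
| hsmgp | 384 | OW PBD(49, 7) | 2.2 | RD | 2.53 | 0.13 |
| hsmgp | 480 | OW PBD(125, 5) | 1.7 | RD | 2.24 | 0.11 |
| javagc | 423 | OW PBD(49, 7) | 37.4 | RD | 24.76 | 2.42 |
| javagc | 534 | OW RD | 31.3 | RD | 23.27 | 4.00 |
| javagc | 855 | OW PBD(125, 5) | 21.9 | RD | 21.83 | 7.07 |
| javagc | 2571 | OW PBD(49, 7) | 28.2 | RD | 17.32 | 7.89 |
| sac | 2060 | OW RD | 21.1 | RD | 15.83 | 1.25 |
| sac | 2295 | OW PBD(125, 5) | 20.3 | RD | 17.95 | 5.63 |
| sac | 2499 | OW PBD(49, 7) | 16 | RD | 17.13 | 2.22 |
| sac | 3261 | PW RD | 30.7 | RD | 15.40 | 2.05 |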
