Initial commit

This commit is contained in:
ErdemOzgen 2023-12-04 13:31:07 +03:00
parent 5196f3b9db
commit fb3819cc41
148 changed files with 5263 additions and 2 deletions


File diff suppressed because it is too large


@ -0,0 +1,3 @@
#index
* [[Apache Airflow]]
*


@ -0,0 +1,122 @@
# Generative Modeling
A generative model describes how a dataset is generated, in terms of a probabilistic
model. By sampling from this model, we are able to generate new data.
First, we require a dataset consisting of many examples of the entity we are trying to
generate. This is known as the training data, and one such data point is called an
**observation**.
![[Screenshot from 2023-07-18 09-32-29.png]]
A generative model must also be probabilistic rather than deterministic. If our model
is merely a fixed calculation, such as taking the average value of each pixel in the
dataset, it is not generative because the model produces the same output every time.
The model must include a stochastic (random) element that influences the individual
samples generated by the model.
![[Screenshot from 2023-07-18 09-34-24.png]]
One key difference is that when performing discriminative modeling, each observation in the training data has a label.
Discriminative modeling estimates p(y | x), the probability of a label y given observation x.
Generative modeling estimates p(x), the probability of observing observation x.
If the dataset is labeled, we can also build a generative model that estimates the distribution p(x | y).
In other words, discriminative modeling attempts to estimate the probability that an observation x belongs to category y. Generative modeling doesn't care about labeling observations. Instead, it attempts to estimate the probability of seeing the observation at all.
Key concepts:
* sample space
* density function
* parametric modeling
* maximum likelihood estimation
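As a toy illustration of parametric modeling and maximum likelihood estimation (a minimal sketch, not from the book; the Gaussian model and the data are made up):
```python
import numpy as np

# Toy "training data": 1,000 observations from some unknown process
rng = np.random.default_rng(seed=0)
observations = rng.normal(loc=5.0, scale=2.0, size=1000)

# Parametric model: a Gaussian with parameters (mu, sigma).
# For a Gaussian, the maximum likelihood estimates are the sample mean and standard deviation.
mu_hat = observations.mean()
sigma_hat = observations.std()

# Density function of the fitted model
def density(x):
    return np.exp(-0.5 * ((x - mu_hat) / sigma_hat) ** 2) / (sigma_hat * np.sqrt(2 * np.pi))

# Because the model is probabilistic, sampling from it generates new observations
new_samples = rng.normal(loc=mu_hat, scale=sigma_hat, size=5)
print(mu_hat, sigma_hat, new_samples)
```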
![[Screenshot from 2023-07-18 09-54-01.png]]
![[Pasted image 20230718102453.png]]
Generative Modeling Challenges
• How does the model cope with the high degree of conditional dependence
between features?
• How does the model find one of the tiny proportion of satisfying possible generated observations among a high-dimensional sample space?
The fact that deep learning can form its own features in a lower-dimensional space
means that it is a form of representation learning. It is important to understand the
key concepts of representation learning before we tackle deep learning in the next
chapter.
### Representation Learning
The core idea behind representation learning is that instead of trying to model the
high-dimensional sample space directly, we should instead describe each observation
in the training set using some low-dimensional latent space and then learn a mapping
function that can take a point in the latent space and map it to a point in the original
domain. In other words, each point in the latent space is the representation of some
high-dimensional image.
![[Pasted image 20230718105536.png]]
## Variational Autoencoder (VAE)
• An encoder network that compresses high-dimensional input data into a lower-dimensional representation vector
• A decoder network that decompresses a given representation vector back to the original domain
![[Pasted image 20230719124201.png]]
The network is trained to find weights for the encoder and decoder that minimize the
loss between the original input and the reconstruction of the input after it has passed
through the encoder and decoder.
Autoencoders can be used to generate new data through a process called "autoencoder decoding" or "autoencoder sampling." Autoencoders are neural network models that learn to encode input data into a lower-dimensional representation (latent space) and then decode it back to reconstruct the original input. This reconstruction process can also be used to generate new data that resembles the patterns learned during training.
Here's a general approach to using an autoencoder for data generation:
1. Train an Autoencoder: Start by training an autoencoder on a dataset of your choice. The autoencoder consists of an encoder network that maps the input data to a lower-dimensional latent space and a decoder network that reconstructs the original input from the latent space representation.
2. Latent Space Exploration: After training, you can explore the learned latent space by sampling points from it. Randomly generate vectors or sample from a probability distribution to create latent space representations.
3. Decoding: Pass the sampled latent space representations through the decoder network to generate new data. The decoder will transform the latent space representations back into the original data space, generating synthetic data that resembles the patterns learned during training.
4. Control Generation: By manipulating the values of the latent space representations, you can control the characteristics of the generated data. For example, you can interpolate between two latent space points to create a smooth transition between two data samples or explore specific directions in the latent space to generate variations of a particular feature.
It's important to note that the quality of the generated data heavily depends on the quality of the trained autoencoder and the complexity of the dataset. Autoencoders are most effective when trained on datasets with clear patterns and structure.
There are variations of autoencoders, such as variational autoencoders (VAEs), that introduce probabilistic components and offer more control over the generation process. VAEs can generate data that follows a specific distribution by sampling latent variables from the learned distributions.
Remember that the generated data is synthetic and may not perfectly match the real data distribution. It's crucial to evaluate the generated samples and assess their usefulness for your specific application.
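A minimal Keras sketch of steps 1-3 above (the dataset, layer sizes, and training settings are illustrative assumptions, not from these notes):
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# 1. Train an autoencoder (tiny dense example on MNIST, flattened to 784 pixels)
(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

latent_dim = 2
encoder = tf.keras.Sequential([layers.Dense(128, activation="relu"), layers.Dense(latent_dim)])
decoder = tf.keras.Sequential([layers.Dense(128, activation="relu"), layers.Dense(784, activation="sigmoid")])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x_train, x_train, epochs=1, batch_size=256)

# 2. Latent space exploration: sample points from the latent space
z = np.random.normal(size=(5, latent_dim)).astype("float32")

# 3. Decoding: map the sampled latent vectors back to the data space
generated = decoder.predict(z)   # shape (5, 784): five synthetic images
```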
Variational autoencoders solve these problems, by introducing randomness into the
model and constraining how points in the latent space are distributed. We saw that
with a few minor adjustments, we can transform our autoencoder into a variational
autoencoder, thus giving it the power to be a generative model.
Finally, we applied our new technique to the problem of face generation and saw how we can simply choose points from a standard normal distribution to generate new faces. Moreover, by performing vector arithmetic within the latent space, we can achieve some amazing effects, such as face morphing and feature manipulation. With these features, it is easy to see why VAEs have become a prominent technique for generative modeling in recent years.
![[Pasted image 20230719165826.png]]
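A small numpy sketch of the latent-space arithmetic mentioned above (the decoder here is a random placeholder standing in for a trained VAE decoder, and the "smiling" direction is made up; only the interpolation/vector-addition pattern is the point):
```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder decoder: maps a 2-D latent vector to a fake 784-pixel "image"
projection = rng.normal(size=(2, 784))
def decode(z):
    return np.tanh(z @ projection)

z_a = rng.normal(size=(1, 2))   # latent encoding of face A
z_b = rng.normal(size=(1, 2))   # latent encoding of face B

# Face morphing: decode points along the straight line between the two encodings
morph = [decode((1 - t) * z_a + t * z_b) for t in np.linspace(0.0, 1.0, num=8)]

# Feature manipulation: add an attribute direction, e.g. mean(z | smiling) - mean(z | neutral)
smiling_direction = rng.normal(size=(1, 2))   # placeholder for a learned direction
smiling_face = decode(z_a + 1.5 * smiling_direction)
```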


@ -0,0 +1,23 @@
## What are the 4 Vs of Big Data?
There are generally four characteristics that must be part of a dataset to qualify it as big data: volume, velocity, variety and veracity [link](https://bernardmarr.com/what-are-the-4-vs-of-big-data/#:~:text=There%20are%20generally%20four%20characteristics,%2C%20velocity%2C%20variety%20and%20veracity.)
### What is ETL
ETL provides the foundation for data analytics and machine learning workstreams. Through a series of business rules, ETL cleanses and organizes data in a way which addresses specific business intelligence needs, like monthly reporting, but it can also tackle more advanced analytics, which can improve back-end processes or end user experiences. ETL is often used by an organization to: 
- Extract data from legacy systems
- Cleanse the data to improve data quality and establish consistency
- Load data into a target database
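A minimal pandas/SQLite sketch of the extract-cleanse-load pattern (the file name, columns, and target table are made-up examples):
```python
import pandas as pd
import sqlite3

# Extract: pull raw records exported from a legacy system (hypothetical CSV export)
raw = pd.read_csv("legacy_export.csv")

# Cleanse/transform: normalize column names, drop duplicates, enforce types
raw.columns = [c.strip().lower() for c in raw.columns]
clean = raw.drop_duplicates().dropna(subset=["customer_id"])
clean["order_date"] = pd.to_datetime(clean["order_date"], errors="coerce")

# Load: write the cleansed data into a target database table
with sqlite3.connect("warehouse.db") as conn:
    clean.to_sql("orders", conn, if_exists="replace", index=False)
```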
### Apache Beam
Apache Beam is an open-source, unified programming model and set of tools for building batch and streaming data processing pipelines. It provides a way to express data processing pipelines that can run on various distributed processing backends, such as Apache Spark, Apache Flink, Google Cloud Dataflow, and others. Apache Beam offers a high-level API that abstracts away the complexities of distributed data processing and allows developers to write pipeline code in a language-agnostic manner.
The key concept in Apache Beam is the data processing pipeline, which consists of a series of transforms that are applied to input data to produce an output. A transform represents a specific operation on the data, such as filtering, mapping, aggregating, or joining. Apache Beam provides a rich set of built-in transforms, as well as the ability to create custom transforms to suit specific processing needs.
One of the main advantages of Apache Beam is its portability across different processing engines. With Apache Beam, you can write your pipeline code once and run it on multiple execution engines without modifying the code. This flexibility allows you to choose the processing engine that best fits your requirements or take advantage of the capabilities offered by different engines for specific tasks.
Apache Beam supports both batch and streaming processing. It provides a programming model that enables developers to write pipelines that can handle both bounded (batch) and unbounded (streaming) data. This makes it possible to build end-to-end data processing solutions that can handle diverse data processing scenarios.
Overall, Apache Beam simplifies the development of data processing pipelines by providing a unified model and a set of tools that abstract away the complexities of distributed data processing. It allows developers to focus on the logic of their data transformations rather than the intricacies of the underlying execution engines.
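A minimal Apache Beam (Python SDK) sketch of a pipeline with a few transforms; it runs on the local DirectRunner by default and could be submitted to another runner without changing the transforms (element values are made up):
```python
import apache_beam as beam

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Create" >> beam.Create(["alpha 3", "beta 7", "alpha 5"])
        | "Parse" >> beam.Map(lambda line: line.split())
        | "ToKV" >> beam.Map(lambda parts: (parts[0], int(parts[1])))
        | "SumPerKey" >> beam.CombinePerKey(sum)   # aggregate per key
        | "Print" >> beam.Map(print)
    )
```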


@ -0,0 +1,188 @@
Haystack is an **open-source framework** for building **search systems** that work intelligently over large document collections
### The Building Blocks of Haystack
### Nodes
* Haystack offers [nodes](https://docs.haystack.deepset.ai/docs/nodes_overview) that perform different kinds of text processing
* These are often powered by the latest transformer models.
### Transformers
The Transformer model revolutionized the field of NLP and became the foundation for many subsequent advancements, including OpenAI's GPT models. Unlike earlier NLP models that relied on recurrent neural networks (RNNs) or convolutional neural networks (CNNs), the Transformer model relies on a self-attention mechanism.
The self-attention mechanism allows the Transformer model to capture dependencies between different words in a sentence or sequence by assigning different weights to each word based on its relevance to other words in the sequence. This enables the model to effectively model long-range dependencies and improve performance on various NLP tasks such as machine translation, text summarization, and question answering.
The Transformer model consists of an encoder-decoder architecture, where the encoder processes the input sequence and the decoder generates the output sequence. Both the encoder and decoder are composed of multiple layers of self-attention mechanisms and feed-forward neural networks. Large Transformer language models are typically pretrained on large amounts of text data using self-supervised objectives.
Overall, the Transformer model has significantly advanced the state of the art in NLP and has become a crucial component in many applications involving natural language understanding and generation.
NLP's Transformer is an architecture that solves sequence-to-sequence tasks while easily handling long-distance dependencies. It computes the input and output representations without using sequence-aligned RNNs or convolutions, relying entirely on self-attention. Let's look in detail at what transformers are.
https://blog.knoldus.com/what-are-transformers-in-nlp-and-its-advantages/
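A minimal numpy sketch of the scaled dot-product self-attention described above (single head, no masking; the matrices are random toy values):
```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, d_model = 4, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model))      # token embeddings for one sequence

# Learned projection matrices (random here, just to show the shapes)
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
q, k, v = x @ w_q, x @ w_k, x @ w_v

# Each token attends to every token; the weights reflect relevance (scaled dot products)
weights = softmax(q @ k.T / np.sqrt(d_model))   # shape (seq_len, seq_len)
attended = weights @ v                          # contextualized representations
print(weights.round(2))
```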
```python
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
result = reader.predict(query="Which country is Canberra located in?", documents=documents, top_k=10)
# https://docs.haystack.deepset.ai/reference/reader-api
```
### Pipelines
```python
p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])
p.add_node(component=reader, name="Reader", inputs=["Retriever"])
result = p.run(query="What did Einstein work on?")
```
**Readers**, also known as Closed-Domain Question Answering systems in machine learning speak, are powerful models that closely analyze documents and perform the core task of question answering. The Readers in Haystack are trained from the latest transformer-based language models and can be significantly sped up using GPU acceleration. But it's not currently feasible to use the Reader directly on a large collection of documents.
The **Retriever** assists the Reader by acting as a lightweight filter that reduces the number of documents the Reader must process. It scans through all documents in the database, quickly identifies the relevant ones, and dismisses the irrelevant ones. It ends up with a small set of candidate documents that it passes on to the Reader.
```python
p = ExtractiveQAPipeline(reader, retriever)
result = p.run(query="What is the capital of Australia?")
```
You can't do question answering with a Retriever only. And with just a Reader, it would be unacceptably slow. The power of this system comes from the combination of the two nodes.
### Agent
[The Agent](https://docs.haystack.deepset.ai/docs/agent) is a very versatile, prompt-based component that uses a large language model and employs reasoning to answer complex questions beyond the capabilities of extractive or generative question answering. It's particularly useful for multi-hop question answering scenarios where it must combine information from multiple sources to arrive at an answer. When the Agent receives a query, it forms a plan of action consisting of steps it has to complete. It then starts with choosing the right tool and proceeds using the output from each tool as input for the next. It uses the tools in a loop until it reaches the final answer.
```python
agent = Agent(prompt_node=prompt_node, prompt_template=few_shot_agent_template,
              tools=[web_qa_tool], final_answer_pattern=r"Final Answer\s*:\s*(.*)")
hotpot_questions = [
    "What year was the father of the Princes in the Tower born?",
    "Name the movie in which the daughter of Noel Harrison plays Violet Trefusis.",
    "Where was the actress who played the niece in the Priest film born?",
    "Which author is English: John Braine or Studs Terkel?",
]
```
### REST API
To deploy a search system, you need more than just a Python script. You need a service that can stay on, handle requests as they come in, and be callable by many different applications. For this, Haystack comes with a [REST API](https://docs.haystack.deepset.ai/docs/rest_api) designed to work in production environments.
# Tutorial: Build Your First Question Answering System
DocumentStore stores the Documents that the question answering system uses to find answers to your questions. In this tutorial, we're using the `InMemoryDocumentStore`, which is the simplest DocumentStore to get started with. It requires no external dependencies and it's a good option for smaller projects and debugging. But it doesn't scale up so well to larger Document collections, so it's not a good choice for production systems. To learn more about the DocumentStore and the different types of external databases that we support, see [DocumentStore](https://docs.haystack.deepset.ai/docs/document_store).
```python
from haystack.document_stores import InMemoryDocumentStore
document_store = InMemoryDocumentStore(use_bm25=True)
```
### In Haystack, which connects to the DocumentStore first: the Retriever or the Reader?
In Haystack, both the retriever and the reader components can be connected to the document store, but the order in which they are connected depends on the specific pipeline configuration and use case.
The document store is responsible for storing and indexing the documents that the retriever component will search through. It acts as the initial source of information for the retrieval process.
Typically, the retriever component is connected to the document store first. The retriever performs an initial search using a given query to retrieve a set of relevant documents or passages from the document store based on their similarity or relevance to the query. The retrieved documents or passages are then passed on to the reader component for further processing.
Once the retriever component retrieves the relevant documents or passages, the reader component is connected to the retriever's output. The reader component is responsible for extracting the answer or information from the retrieved documents or passages, typically using techniques like machine reading comprehension.
Here's an example of how the retriever and reader components can be connected to the document store in a Haystack pipeline:
```python
# The DocumentStore is attached to the Retriever when the Retriever is created,
# so the Retriever is the component that queries the store first
retriever = BM25Retriever(document_store=document_store)

p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])   # query -> retriever
p.add_node(component=reader, name="Reader", inputs=["Retriever"])     # retrieved docs -> reader
```
In this example, the retriever is created with a reference to the document store, so the document store serves as the retriever's source of documents. The reader is then connected to the retriever's output and uses the retrieved documents as its input.
Please note that the actual configuration and connection of components in your Haystack pipeline may differ based on your specific requirements and implementation.
### What is BM25?
https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables
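For a quick feel of BM25 scoring, here is a hedged sketch using the third-party `rank_bm25` package (not part of Haystack; the corpus and query are toy examples):
```python
from rank_bm25 import BM25Okapi   # pip install rank-bm25

corpus = [
    "Arya Stark is the daughter of Ned Stark",
    "Canberra is the capital city of Australia",
    "BM25 is a bag-of-words ranking function using term frequency and document length",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]

bm25 = BM25Okapi(tokenized_corpus)
query = "who is the father of arya stark".split()
print(bm25.get_scores(query))              # one relevance score per document
print(bm25.get_top_n(query, corpus, n=1))  # best-matching document
```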
Download 517 articles from the Game of Thrones Wikipedia. You can find them in _data/build_your_first_question_answering_system_ as a set of _.txt_ files
```python
from haystack.utils import fetch_archive_from_http
doc_dir = "data/build_your_first_question_answering_system"
fetch_archive_from_http(
url="https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt1.zip",
output_dir=doc_dir
)
```
2. Use `TextIndexingPipeline` to convert the files you just downloaded into Haystack [Document objects](https://docs.haystack.deepset.ai/docs/documents_answers_labels#document) and write them into the DocumentStore:
```python
import os
from haystack.pipelines.standard_pipelines import TextIndexingPipeline
files_to_index = [doc_dir + "/" + f for f in os.listdir(doc_dir)]
indexing_pipeline = TextIndexingPipeline(document_store)
indexing_pipeline.run_batch(file_paths=files_to_index)
```
## Initializing the Retriever
Our search system will use a Retriever, so we need to initialize it. A Retriever sifts through all the Documents and returns only the ones relevant to the question. This tutorial uses the BM25 algorithm. For more Retriever options, see [Retriever](https://docs.haystack.deepset.ai/docs/retriever).
Let's initialize a BM25Retriever and make it use the InMemoryDocumentStore we initialized earlier in this tutorial:
```python
from haystack.nodes import BM25Retriever
retriever = BM25Retriever(document_store=document_store)
```
## Initializing the Reader
A Reader scans the texts it received from the Retriever and extracts the top answer candidates. Readers are based on powerful deep learning models but are much slower than Retrievers at processing the same amount of text. In this tutorial, we're using a FARMReader with a base-sized RoBERTa question answering model called [`deepset/roberta-base-squad2`](https://huggingface.co/deepset/roberta-base-squad2). It's a strong all-round model that's good as a starting point. To find the best model for your use case, see [Models](https://haystack.deepset.ai/pipeline_nodes/reader#models).
Let's initialize the Reader:
```python
from haystack.nodes import FARMReader
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)
```
### Create the Retriever-Reader pipeline
In this tutorial, we're using a ready-made pipeline called `ExtractiveQAPipeline`. It connects the Reader and the Retriever. The combination of the two speeds up processing because the Reader only processes the Documents that the Retriever has passed on. To learn more about pipelines, see [Pipelines](https://docs.haystack.deepset.ai/docs/pipelines).
To create the pipeline, run:
```python
from haystack.pipelines import ExtractiveQAPipeline
pipe = ExtractiveQAPipeline(reader, retriever)
prediction = pipe.run(
    query="Who is the father of Arya Stark?",
    params={
        "Retriever": {"top_k": 10},
        "Reader": {"top_k": 5}
    }
)
```
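To inspect the result, something like the hedged snippet below can be used (it assumes Haystack v1's `print_answers` helper from `haystack.utils`):
```python
from haystack.utils import print_answers

# Print the top answers with minimal detail (answer text plus context)
print_answers(prediction, details="minimum")
```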


@ -0,0 +1 @@
#index


@ -0,0 +1,4 @@
* https://www.darkreading.com/dr-tech/10-free-purple-team-security-tools-2023
* https://www.darkreading.com/application-security/10-cool-security-tools-open-sourced-by-the-internet-s-biggest-innovators
* https://github.com/danluu/post-mortems
*


@ -0,0 +1,3 @@
#todos
# write about


@ -0,0 +1,4 @@
#index
* [[medium article ideas]]
* [[academic paper ideas]]


@ -0,0 +1 @@
#index

File diff suppressed because it is too large


@ -0,0 +1,50 @@
**Multi-tenancy** is a **software architecture where a single software instance or application serves multiple customers or user groups, called tenants** [1](https://www.redhat.com/en/topics/cloud-computing/what-is-multitenancy)[2](https://www.techtarget.com/whatis/definition/multi-tenancy)[3](https://www.simplilearn.com/what-is-multitenancy-article). It is the opposite of single tenancy, when a software instance or system has only one user or group [1](https://www.redhat.com/en/topics/cloud-computing/what-is-multitenancy). Multi-tenancy is the backbone of cloud computing, where software is hosted, provisioned and managed by a cloud provider and accessed by users over the Internet [1](https://www.redhat.com/en/topics/cloud-computing/what-is-multitenancy)[4](https://www.techopedia.com/definition/16633/multitenancy). The tenants are logically isolated from each other, but physically integrated in the shared environment [5](https://www.gartner.com/en/information-technology/glossary/multitenancy). They can customize some aspects of the software, but not the code.
**SCALABILITY** - ability of a _software system_ to process higher amount of workload on its current hardware resources (_scale up_) or on current and additional hardware resources (_scale out_) without application service interruption;
**ELASTICITY** - ability of the _hardware layer_ below (usually cloud infrastructure) to increase or shrink the amount of the physical resources offered by that hardware layer to the software layer above. The increase / decrease is triggered by business rules defined in advance (usually related to application's demands). The increase / decrease happens on the fly without physical service interruption.
Scalability is the ability of the system to accommodate larger loads just by adding resources either making hardware stronger (scale up) or adding additional nodes (scale out).
Elasticity is the ability to fit the resources needed to cope with loads dynamically usually in relation to scale out. So that when the load increases you scale by adding more resources and when demand wanes you shrink back and remove unneeded resources. Elasticity is mostly important in Cloud environments where you pay-per-use and don't want to pay for resources you do not currently need on the one hand, and want to meet rising demand when needed on the other hand.
**AWS Regions**
AWS has the concept of a Region, which is a physical location around the world where we cluster data centers. We call each group of logical data centers an Availability Zone. Each AWS Region consists of a minimum of three, isolated, and physically separate AZs within a geographic area.
**Availability Zones**
An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. AZs give customers the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center. All AZs in an AWS Region are interconnected with high-bandwidth, low-latency networking, over fully redundant, dedicated metro fiber providing high-throughput, low-latency networking between AZs. All traffic between AZs is encrypted. The network performance is sufficient to accomplish synchronous replication between AZs. AZs make partitioning applications for high availability easy. If an application is partitioned across AZs, companies are better isolated and protected from issues such as power outages, lightning strikes, tornadoes, earthquakes, and more. AZs are physically separated by a meaningful distance, many kilometers, from any other AZ, although all are within 100 km (60 miles) of each other.
**AWS Edge Locations**
Edge locations are endpoints for AWS which are used for **caching content** and used as Content delivery network (CDN).
This consists of Amazon CloudFront (CF). There are many more edge locations than regions (217 Points of Presence: 205 Edge Locations and 12 Regional Edge Caches) across the globe.
Edge locations serve requests for CloudFront and Route 53. CloudFront is a content delivery network, while Route 53 is a DNS service. Requests going to either one of these services will be routed to the nearest edge location automatically. **This allows for low latency no matter where the end user is located**.
**AWS Local Zones**
AWS Local Zones allow you to use select AWS services, like compute and storage services, closer to more end-users, providing them very low latency access to the applications running locally.
AWS Local Zones are also connected to the parent region via Amazon's redundant and very high bandwidth private network, giving applications running in AWS Local Zones fast, secure, and seamless access to the rest of AWS services.
AWS Local Zones have their own connection to the internet and support AWS Direct Connect, so resources created in the Local Zone can serve **local end-users** with very low-latency communications.
# Policy and Role
Users can manage access in AWS by creating policies and associating them with IAM identities or AWS resources. A policy is an AWS object that defines the permissions of the identity or resource it is associated with.
AWS evaluates these policies whenever a principal entity, such as a user or role, makes a request.
[docs](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html#Overview%20of%20Json%20Policies)
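A hedged boto3 sketch of creating a policy and attaching it to a role (the policy name, role name, bucket, and statement are made-up examples; credentials and the role itself are assumed to exist already):
```python
import json
import boto3

iam = boto3.client("iam")

# A minimal identity-based policy document allowing read-only access to one S3 bucket
policy_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::example-bucket", "arn:aws:s3:::example-bucket/*"],
    }],
}

response = iam.create_policy(
    PolicyName="ExampleS3ReadOnly",
    PolicyDocument=json.dumps(policy_document),
)

# Associate the policy with an existing IAM role (hypothetical role name)
iam.attach_role_policy(RoleName="example-app-role", PolicyArn=response["Policy"]["Arn"])
```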
Elastic Beanstalk or Elastic Container service ?
EB vs ECS really comes down to control. Do you want to control your scaling and capacity or do you want to have that more abstracted and instead focus primarily on your app. ECS will give you control, as you have to specify the size and number of nodes in the cluster and whether or not auto-scaling should be used. With EB, you simply provide a Dockerfile and EB takes care of scaling your provisioning of number and size of nodes, you basically can forget about the infrastructure with the EB route.
Here's the EB documentation on Docker: [http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker.html](http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker.html)
With ECS you'll have to build the infrastructure first before you can start deploying the Dockerfile, so it really comes down to 1) your familiarity with infrastructure and 2) the level of effort you want to spend on the infrastructure vs. the app.


@ -0,0 +1,8 @@
* AWS Cloud Practitioner Udemy Course [url](https://havelsan.udemy.com/course/aws-certified-cloud-practitioner-training-course/learn/lecture/26140426#overview)
* Question Practice [url](https://havelsan.udemy.com/course/practice-exams-aws-certified-cloud-practitioner/learn/quiz/4915789#overview)
* Study Guide book [https://github.com/mohankumarbm/aws-ccp-certification](https://github.com/mohankumarbm/aws-ccp-certification)
* Github Markdown Notes [url](https://github.com/kennethleungty/AWS-Certified-Cloud-Practitioner-Notes)
* Cheatsheet [url](https://digitalcloud.training/category/aws-cheat-sheets/aws-cloud-practitioner/)
* This book [url](https://lib-5jhezsvfqkepb7glrqx6ivwm.1lib.me/book/5240517/361808)
* Learn AWS in month of lunches


@ -0,0 +1,4 @@
* https://engineering.opsgenie.com/convert-radio-waves-to-alerts-using-sdr-aws-lambda-and-amazon-transcribe-7ba64f8eefa
* Create an AMI for an AWS pentesting image, created from scratch
* Create a price calculator for AWS with ChatGPT and open source ChatGPT frameworks, or just use Lambda functions like [this](https://alexanderhose.com/implementing-chatgpt-on-aws-a-step-by-step-guide/)


@ -0,0 +1,2 @@
DEVOPS AWS
https://www.edx.org/xseries/aws-devops-on-aws?index=product&queryID=f7dea23ade2982a1e0b7acacf47ae165&position=1


@ -0,0 +1,57 @@
Study guide
#Chapter1 Cloud
Since there's no human processing involved in cloud compute billing, it's as easy for a provider to charge a few pennies as it is thousands of dollars. This metered payment makes it possible to consider entirely new ways of testing and delivering your applications, and it often means your cost-cycle expenses will be considerably lower than they would be if you were using physical servers running on-premises.
Comparing the costs of cloud deployments against on-premises deployments requires that you fully account for both capital expenses (capex) and operating expenses (opex). On-premises infrastructure tends to be very capex-heavy since you need to purchase loads of expensive hardware up front. Cloud operations, on the other hand, involve virtually no capex costs at all. Instead, your costs are ongoing, consisting mostly of per-hour resource “rental” fees.
# Cloud Platform Models
* Infrastructure as a Service
You'll learn much more about these examples later in the book, but AWS IaaS products include Elastic Compute Cloud (EC2) for virtual machine instances, Elastic Block Store (EBS) for storage volumes, and Elastic Load Balancing
* Platform as a Service
AWS PaaS products include Elastic Beanstalk and Elastic Container Service (ECS).
* Software as a Service
While some may disagree with the designation, AWS SaaS products arguably include Simple Email Service and Amazon WorkSpaces.
* ![[Screenshot from 2023-03-23 10-00-42.png]]
* Serverless
* The serverless model, as provided by services like AWS Lambda, makes it possible to design code that reacts to external events. When, for instance, a video file is uploaded to a repository (like an AWS S3 bucket or even an on-premises FTP site), it can trigger a Lambda function that will convert the file to a new video format. There's no need to maintain and pay for an actual instance running 24/7, just for the moments your code is actually running. And there's no administration overhead to worry about (a sketch of such a handler follows below).
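A minimal AWS Lambda handler sketch for the S3-triggered scenario above (the target bucket name and the conversion step are placeholders, not from the guide):
```python
import boto3

s3 = boto3.client("s3")

def convert_video(path):
    # Placeholder: a real implementation would call e.g. ffmpeg here
    return path

def handler(event, context):
    # Invoked by S3 "object created" events
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        local_path = "/tmp/" + key.rsplit("/", 1)[-1]
        s3.download_file(bucket, key, local_path)
        converted = convert_video(local_path)
        s3.upload_file(converted, "converted-videos-bucket", key)   # hypothetical target bucket
```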
While the precise layout and organization will change over time, as of this writing the main AWS documentation page can be found at https://docs.aws.amazon.com. There you'll find links to more than 100 AWS services along with tutorials and projects, software development kits (SDKs), toolkits, and general resources.
https://aws.amazon.com/premiumsupport/knowledge-center/ is basically a frequently asked questions (FAQ) page that accidentally swallowed a family pack-sized box of steroids and then walked through the radioactive core of a nuclear power plant wearing wet pajamas. Or, in simpler terms, there's a lot of information collected here.
The page, found at https://aws.amazon.com/security/security-resources, points to AWS blogs, white papers, articles, and tutorials covering topics such as security best practices and encrypting your data in transit and at rest.
# AWS Global Infrastructure: AWS Regions
AWS performs its cloud magic using hundreds of thousands of servers maintained within physical data centers located in a widely distributed set of geographic regions.
Dividing resources among regions lets you do the following:
* Locate your infrastructure geographically closer to your users to allow access with the lowest possible latency
* Locate your infrastructure within national borders to meet regulatory compliance with legal and banking rules
* Isolate groups of resources from each other and from larger networks to allow the greatest possible security
AWS Shared Responsibility ==> the security and integrity of the resources you run in the cloud are your responsibility, but the security of the cloud itself is managed by AWS.
![[Screenshot from 2023-04-03 22-53-28.png]]
IAAS Infrastructure as a Service ==> AWS EC2, Elastic block storage EBS
PAAS Platform as a service ==> aws elastic beanstalk and elastic container service (ECS)
SAAS Software as a Service => simple email service, aws workspace
serverless model => Lambda
# Scalability and Elasticity
A scalable service will automatically grow in capacity to seamlessly meet any changes in demand. A large cloud provider like AWS will, for all practical purposes, have endless available capacity, so the only practical limit to the maximum size of your application is your organization's budget.
Elasticity The reason the word elastic is used in the names of so many AWS services (Elastic Compute Cloud, Elastic Load Balancing, Elastic Beanstalk, and so on) is because those services are built to be easily and automatically resized.
Understand how scalability allows applications to grow to meet need. A cloud-optimized application allows for automated provisioning of server instances that are designed from scratch to perform a needed compute function within an appropriate network environment. Understand how elasticity matches compute power to both rising and falling demand. The scaling services of a cloud provider, like AWS Auto Scaling, should be configured to force compliance with your budget and application needs. You set the upper and lower limits, and the scaler handles the rest.


@ -0,0 +1 @@
#index


@ -0,0 +1,115 @@
Azure DevOps (Microsoft's tool suite) provides version control, reporting, requirements management, project management, automated builds, testing, and release capabilities.
### Continuous Integration 
- Automated tests make sure that the bugs are captured in the early phases, and fewer bugs reach the production phase. 
- After the issues are resolved efficiently, it becomes easy to build the release.
- Developers are alerted when they break any build, so they have to rebuild and fix the build before moving forth on to the next task.
- As Continuous Integration can run multiple tests within seconds, the cost of testing decreases drastically.
- When lesser time is invested in testing, more time can be spent in the improvement of quality.
### Continuous Delivery
- The process of deploying software is no more complex, and now the team does not need to spend a lot of time preparing the release anymore.
- The releases can be made more frequently; this in turn speeds up the feedback loop with the customers.
- The iterations in the case of the process become faster.
### Continuous Deployment
- There is no need to stop the development for releases anymore, as the entire deployment process is now automated.
- The release process is less prone to risk and is easy to fix when issues arise, as only small batches of changes are deployed.
- There is a continuous chain of quality improvements with every passing day. Development no longer takes long durations like a month or a year.
Continuous Delivery vs Deployment
Continuous Delivery is a software engineering practice where the code changes are prepared to be released.
Continuous Deployment aims at continuously releasing the code changes into the production environment.
# Azure pipelines
* **Build pipelines**:
These take instructions from a YAML file and build and publish artifacts from the cloned source code.
* **Release pipeline**
These pipelines deploy build artifacts onto agent machines.
* **Create release**
This one helps us build a complete end-to-end CI/CD pipeline.
Example azure yaml templates [url](https://github.com/microsoft/azure-pipelines-yaml)
Azure Board supports Agile boards
# Azure DevSecOps [URL](https://havelsan.udemy.com/course/devsecops-with-azure-devops/learn/lecture/33386494#overview)
![[Screenshot from 2023-03-13 14-15-06.png]]
* [[SAST(Static Application Security testing)]]
* [[SCA (Software Composition Analysis)]]
* [[DAST (Dynamic Application Security Testing)]]
* [[IAST(Interactive Application Security Testing)]]
* [[IAC(infrastructure as code)]]
* [[API Security]]
The shift-left approach is the DevSecOps approach.
## Development stage
* Git secrets
* Security Plugins in IDE
* TruffleHog (has an enterprise license), similar to git-secrets
## Security
* Code Quality tools (Sonarqube)
* SAST security tools (Fortify, Veracode, Checkmarx)
* SCA tools (Snyk, Veracode, Fortify, Black Duck)
* DAST tools (OWASP ZAP, WebInspect, Veracode DAST, Acunetix)
* IaC tools (Snyk, Bridgecrew)
* Container security (Aqua, Qualys, Prisma Cloud)
## Operations
* Build pipeline tools (Jenkins, AWS, GCP Cloudbuild,Azure devops, github actions, Gitlab)
* Cloud security posture (Aqua, Bridgecrew)
* Container Registry Scanning Tools (Aqua, AWS native registry scanning)
* Infrastructure Scanning tools (Chef InSpec (compliance), Nessus)
* Cloud security (Azure Defender, AWS Security Hub)
# Devsecops in Azure DevOps
![[Screenshot from 2023-03-13 14-34-07.png]]
Take a look at repository section.
https://github.com/asecurityguru/just-another-vulnerable-java-application
Added Azure DevOps yaml ==>
https://github.com/asecurityguru/devsecops-azure-devops-simple-yaml-file-repo
# SonarCloud
SaaS code quality and security tool. #todos/recordingangel
SonarCloud custom quality gate ==> add it to the Azure DevOps YAML for the DevSecOps pipeline.
Use section 4 of the course for custom quality gate examples.
**Need to add a quality gate for our pipeline**
Use the environment section in Azure DevOps to store the token referenced in the YAML.
# Snyk
* Source code
* SaaS
* Open source Third party libraries
* Containers
* Infra as Code.
# OWASP ZAP


@ -0,0 +1,25 @@
https://www.cybersecasia.net/tips/nine-devsecops-scanning-tools-to-keep-the-bad-guys-at-bay
**[gitLeaks](https://github.com/zricethezav/gitleaks)**
**[Git-Secrets](https://github.com/awslabs/git-secrets)**
**[Whispers](https://github.com/Skyscanner/whispers)**
**[GitHub Secret Scanning](https://docs.github.com/en/developers/overview/secret-scanning)**
**[GittyLeaks](https://github.com/kootenpv/gittyleaks)**
**[Scan](https://slscan.io/)**
**[Git-all-secrets](https://github.com/anshumanbh/git-all-secrets)**
**[Detect-secrets](https://github.com/Yelp/detect-secrets)**
**[SpectralOps](https://spectralops.io/)**
https://github.com/Comcast/xGitGuard
[TruffleHog](https://github.com/trufflesecurity/trufflehog)


@ -0,0 +1,10 @@
[[Efective DevOps Building a Culture of Collaboration, Afnity, and Tooling at Scale]]
[[Kubernetes]]
# Books to Read
• The Phoenix Project by Gene Kim
* Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation
* hands on security devops
* DevOpsSec book


@ -0,0 +1,36 @@
## Design and Practice of Security Architecture via DevSecOps Technology DOI:10.1109/ICSESS54813.2022.9930212
![[Screenshot from 2023-03-15 10-31-39.png]]
![[Screenshot from 2023-03-15 10-31-59.png]]
DevSecOps architecture design is divided into 10 phases.
DevSecOps architecture is designed to meet the internationally leading cloud native security 4C model (CNCF standard: cloud, cluster, container, code) and the security development lifecycle (Microsoft standard) evaluation system. Across the two areas of R&D performance and security, security is introduced into every stage of the R&D process (DORA Level 5 standard: integrate security in the requirements, design, build, test, and deployment phases).
![[Screenshot from 2023-03-15 10-41-07.png]]
## Implementation of DevSecOps by Integrating Static and Dynamic Security Testing in CI/CD Pipelines DOI:10.1109/ICOSNIKOM56551.2022.10034883
https://github.com/lianahq/skinner ==> a Python script named Skinner that performs automated security testing with Burp Suite Pro on the GitLab CI pipeline using the DevSecOps implementation procedure.
## Challenges and solutions when adopting DevSecOps: A systematic review [https://doi.org/10.1016/j.infsof.2021.106700](https://doi.org/10.1016/j.infsof.2021.106700 "Persistent link using digital object identifier")
![[Screenshot from 2023-03-15 13-01-09.png]]
# Challenges About DevSecOps
![[Screenshot from 2023-03-15 13-41-20.png]]
![[Screenshot from 2023-03-15 14-49-53.png]]
![[Screenshot from 2023-03-15 14-50-13.png]]![[Screenshot from 2023-03-15 14-52-15.png]]


@ -0,0 +1,206 @@
User access keys provide programmatic access to AWS services using the CLI, PowerShell Tools, and SDKs:
• Allow full access to AWS under the user's assigned group/policies
• Do NOT hard-code access keys in source code
• Do NOT check in to source control repositories
• Store in the AWS credentials file, user environment variables, or not at all (more on this later)
• Rotate access keys regularly
• Remove access keys as part of the employee offboarding process
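A hedged boto3 sketch of the access-key rotation mentioned above (the user name is a placeholder; in practice you would distribute and verify the new key before deactivating and deleting the old one):
```python
import boto3

iam = boto3.client("iam")
user = "example-user"   # placeholder IAM user name

# Create a replacement key first (IAM allows at most two access keys per user)
new_key = iam.create_access_key(UserName=user)["AccessKey"]
print("new key id:", new_key["AccessKeyId"])   # the secret is returned only once; store it securely

# Then retire every other key for the user
for key in iam.list_access_keys(UserName=user)["AccessKeyMetadata"]:
    if key["AccessKeyId"] != new_key["AccessKeyId"]:
        # Deactivate before deleting so the change can be reverted if something breaks
        iam.update_access_key(UserName=user, AccessKeyId=key["AccessKeyId"], Status="Inactive")
        iam.delete_access_key(UserName=user, AccessKeyId=key["AccessKeyId"])
```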
### Docker Image to AMI [URL](https://stackoverflow.com/a/45146861)
### Passwords for VM SANS
Username: student Password: StartTheLabs
• Development: rapid and frictionless delivery of features through Agile and Lean methods, by small colocated teams, Continuous Integration, work managed by sticky notes on a wall, “working software over documentation”
• Operations: minimize firefighting and downtime, maximize stability and efficiency by following ITSM governance frameworks (ITIL, COBIT), rigorous change management, using standardized technology, configuration management, work managed in ticketing systems
• Security and Compliance: risk-focused, assurance of controls through stage gates, point-in-time audits, pen testing, spreadsheets, and checklists
DevOpsSec: break down the barriers with security and compliance
_Dogfooding_ is short for "Eating your own dog food," which represents the practice of using your own products.  For software developers, that means working with, as a real user, the applications you're building, or at least working closely with people who do use it.  Dogfooding provides a number of advantages, both marketing and technical.
Three key characteristics of DevOps unicorns:
1. Omnipresent culture: around values of accountability, continuous learning, collaboration, and experimentation. High levels of patience, trust, ethics, and empowerment. Little patience for waste and inefficiency in decision making and bureaucracy.
2. Technology savvy, customer-obsessed business leadership. Executives at all levels fully understand the importance of technology to their success.
3. Optimized organizational structure: prepared to rethink structure, staffing, performance metrics, and ownership
![[Screenshot from 2023-07-27 21-15-52.png]]
Security Development Lifecycle (SDL): Microsoft has had a version of its SDL since 2004. However, in 2008, they began publishing/releasing their SDL, and many companies have used their SDL to model their own internal secure development efforts. They have also released tools to assist with the security activities within the SDL, such as the Attack Surface Analyzer (https://www.microsoft.com/en-us/download/details.aspx?id=24487) and the Microsoft Threat Modeling Tool (https://www.microsoft.com/en-us/download/details.aspx?id=49168).
Amazon has developed an extensive set of cloud-based security services that are available to users of AWS:
• IAM, CloudWatch, CloudTrail, Trusted Advisor, Inspector, DDOS protection, KMS, managed WAF…
Shared Responsibility Model
• Understand and separate what Amazon is responsible for and what the customer is responsible for
• You are responsible for using AWS capabilities correctly
AWS Cloud Compliance
• Certified operating environments for finance, healthcare, government, PCI
• Higher SLAs and detailed guidance
CAMS (or CALMS) is a common lens for understanding DevOps and for driving DevOps change. Your organization succeeds when it reaches “CALMS”:
• Culture: people come first
• Automation: rely on tools for efficiency and repeatability
• Lean: apply Lean engineering practices to continuously improve
• Measurement: use data to drive decisions and improvements
• Sharing: share ideas, information, and goals across silos
Developers are lazy, so make it easy for them to do the right thing. Make systems safe by default: provide safe libraries and safe default configurations in templates and make them available to engineers. Bake security into base images and watch closely when base (“gold”) images are changed. Publish and evangelize safe patterns.
• Engineering autonomy: Provide developers with self-service tools so that they can take responsibility for security in whatever they are working on.
• Undifferentiated heavy lifting: Work with Amazon AWS to provide high-quality, safe infrastructure as a service, and leverage the cloud provider's built-in capabilities to scale efficiently. Take advantage of AWS (cloud) APIs to do security work: snapshot drives for forensic analysis, change firewall configs, inventory systems…
• Scale engineering (and security) through extensive automation.
• Eliminate snowflake configurations through standard deployment tools and templates.
• Microservices: Assess risks at the service level and provide transparency to teams.
• Continuous Deployment: Hundreds of small changes are made every day, which means that there are many chances for making small errors, so…
• Trust, but verify. No security gates or change review boards. Extensive checks in test and production (security, compliance, reliability…).
In DevOps, the goal is to automate as much of the work as possible through code. Get everything out of paper (policies, procedures, run books, checklists) and spreadsheets and into code that can be reviewed, scanned, tracked, and tested.
All code needs to be checked in to a source code control system/repository—if possible, a common repository or set of repositories shared by dev and ops—not just application code and unit tests written by developers, but database schemas, application configuration specifications, documentation, build and deployment scripts, operational scripts, job schedules, and everything needed to set up, deploy, and run the system, from the bare metal up (configuration cookbooks or manifests and associated tests, hardening templates…).
MTTR: Mean Time to Recover or Repair from a failure. Together with Change Failure Rate, this measures the reliability/quality of service and availability. Some teams may want to separately track, and optimize for, MTTD—Mean Time to Detect a failure—so that they can look for ways to identify problems quickly. Note that many DevOps teams do not measure or optimize for MTTF (Mean Time to Failure) because they recognize that failures will happen. Instead, they work on trying to minimize the impact and cost of failures. See John Allspaw: https://www.kitchensoap.com/2010/11/07/mttr-mtbf-formost-types-of-f/
Change Lead Time or Cycle Time. The average time it takes to get a change or fix into production, which is a key metric for DevOps teams (and Lean teams) to optimize for. This can be measured from three points:
1. Change cycle time: from when a change was requested by the business to when it is deployed. This looks at the full value stream, both upstream and downstream of development.
2. Development change lead time: from when development starts to when the change is deployed (a subset of the change cycle time, which focuses on speeding up development, testing, and deployment).
3. Deployment lead time: from when development is finished to when the change is deployed (the tail end of the change cycle time, which focuses on speeding up acceptance testing, change control, and deployment).
## Security measurement
• Measure automated test coverage for high-risk code
• Track # of vulnerabilities found… and where they were found in the pipeline
• Track # of vulnerabilities fixed
• How long vulnerabilities remain open (window of exposure)
• Type of vulnerability (OWASP Top 10) for Root Cause Analysis
• Elapsed time for security testing: make feedback loops as short as possible
• False positives versus true positives: improve quality of feedback
• Vulnerability escape rate to production
Continuous Deployment: from 2x/week to 50x/day
• Engineers push (small) changes to production on their first day on the job
A “Just Culture” shared across the organization
• Blameless Postmortems (and Morgue): it is safe to make mistakes, as long as you own them and help fix them
• Security Outreach: don't be a jerk to developers
Measure Everything: data-driven learning and decisions
• If in doubt, measure it: engineers are “addicted to data porn”
• Make data visible: Etsy “worships at the church of graphs”
• Use real data to improve security: “attack-driven defense”
Automatically monitor changes to high-risk code: why is somebody changing crypto or authentication functions?
Attack-driven (and data-driven) defense: monitor attack activity at the application level in production, and use this to prioritize testing and defensive actions. What kind of attacks are you seeing? Replay these attacks to see which ones are succeeding. Make information about security attacks visible to everyone in engineering and ops.
Technology: How do you manage risks in new, rapidly evolving platforms and architectures such as microservices, cloud, containers, serverless? Integrity: Is there enough time to fully test and review changes before they make it to production? Availability: Does frequent change increase chances of failure? Confidentiality: In “you build it, you run it”, how do you control developer access to production data?
# DevOps Kata - Single Line of Code
Since DevOps is a broad topic, it can be difficult to determine if a team has enough skills and is doing enough knowledge sharing to keep the [Bus Factor](http://en.wikipedia.org/wiki/Bus_factor) low. It can also be difficult for someone interested in learning to know where to start. I thought I'd try to brainstorm some DevOps katas to give people challenges to learn and refine their skills. If you're worried about your bus factor, challenge less experienced team members to do these katas, imagining the senior team members are unavailable.
## Single Line of Code
Goal: Deploy a change that involves a single line of code to production.
The Deployment Kata is also a useful tool for compliance and governance. By deploying a simple, easy-to-follow change, you can walk auditors through how patches, upgrades, and other changes are made to a system, showing them all of the steps and tests, and letting them review the build artifacts and evidence created along the path.
Opportunities for security testing in Continuous Integration are limited because of the rapid cycle time in CI. Testing in CI is designed to catch regressions on a code change. In order to encourage fast feedback to developers, the entire check-in and build/test cycle has to complete within a few minutes at most, which means that tests have to execute quickly and cannot require complex setup. All of the tests that execute in CI have to provide unambiguous pass/fail results. Flakey tests, and tests that may return false positives, will be ignored by development teams. There is no time for comprehensive static or dynamic scanning in Continuous Integration.
CI often includes at least some basic static analysis (checks for hardcoded credentials, dangerous functions, dependency checks) and incremental static analysis checking if this is supported by the tools that you are using
Smoke testing, also called _build verification testing_ or _confidence testing_, is a software testing method that is used to determine if a new software [build](https://www.techtarget.com/searchsoftwarequality/definition/build) is ready for the next testing phase. This testing method determines if the most crucial functions of a program work but does not delve into finer details.
## CD
• Pipeline model and control framework built on/extending Continuous Integration and Agile build/test practices
• Uses the latest good build from CI, packages it for deployment and release
• Changes are automatically pushed to test/staging environments to conduct more realistic/comprehensive tests
• Can insert manual reviews/testing/approvals between pipeline stages
• Log steps and results to provide an audit trail from check-in to deploy
• Any failures will “stop the line”: no additional changes can be accepted until the failure is corrected
• Ensures that code is always ready to be deployed: changes may be batched up before production release
A CD workflow could consist of the following steps:
1. IDE checking for coding/security mistakes as code is entered/changed
2. Pre-commit code reviews
3. Pre-commit smoke test
4. Commit build in CI with fast feedback to developers: SAST (incremental), automated unit tests with code coverage failure, integration sanity checks (some of these steps could be done in parallel)
5. Software Component Analysis (SCA) on open-source components to identify code with known vulnerabilities (some SCA tools will also check for licensing risks)
6. Alert on high-risk code changes (e.g., unit tests that check hash value of code, or quick scanning for dangerous functions) which require review by InfoSec
7. Store binaries, configuration files, and other artifacts in repository
8. Deploy to acceptance test environment (configure and stand up test systems using Puppet/Chef, Terraform, Docker…) and run post-deployment asserts/smoke tests
9. Automated acceptance and integration testing
10. Automated performance and load tests (in parallel)
11. Automated dynamic (DAST) scans—with clear pass/fail criteria
12. Deploy to staging using same deployment tools and instructions as acceptance test—and run postdeployment asserts/smoke tests
13. Environmental and data migration tests
14. Code is now ready to be deployed to production
15. Environmental/data migration checks
16. Operations tests
17. Code is ready to be deployed and released to production
![[Screenshot from 2023-07-29 17-06-55.png]]
![[Screenshot from 2023-07-29 17-09-50.png]]
Blue/Green Deployment is a pattern for managing Continuous Deployment. You run two different environments (one “blue”, one “green”) in production. The blue environment is active. Changes are rolled out to the green environment. Once the changes are deployed and the green environment is running and warmed up, load balancing is used to reroute traffic from the blue to the green environment. Once all customer traffic has been routed to the green environment, the blue environment is available to be updated.
Canary Releasing (https://martinfowler.com/bliki/CanaryRelease.html) Another technique to minimize the impact and risk of Continuous Deployment is "canary releasing". Changes are pushed to one server and carefully monitored to ensure that the update was done correctly and everything is running as expected. Then the change is pushed to two servers and checked, then ten servers and checked again, then half of the servers, before finally being pushed to all servers. At any point, if monitoring or other checks determine that the change is not working as expected, the change can be automatically rolled back, or the rollout can be halted until a fix is rolled out.
Before deployment, check that operational dependencies are correct. After deployment, ensure that the system is set up and running correctly:
• Simple, end-to-end tests of core functions using test data/simulated transactions
• Ensure that all connections are running
• Check that monitoring functions are working correctly
• Configuration checks
• Version/dependency checks
• Basic runtime security smoke tests to catch obvious mistakes
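For example, a post-deployment check job could look something like this (a sketch only; the health endpoint, version endpoint, and URL are assumptions):

```yaml
post-deploy-checks:
  stage: .post                 # built-in stage that runs after the other stages
  image: alpine:latest
  script:
    - apk add --no-cache curl jq
    # End-to-end check of a core function using a simulated transaction.
    - curl -fsS https://app.example.com/health | jq -e '.status == "ok"'
    # Version/dependency check: did the expected build actually go out?
    - test "$(curl -fsS https://app.example.com/version)" = "$CI_COMMIT_SHORT_SHA"
    # Basic configuration/security smoke test: HTTP should redirect to HTTPS.
    - "curl -sI http://app.example.com/ | grep -i '^location: https://'"
```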
![[Screenshot from 2023-07-29 17-18-38.png]]
CD PIPELINE RULES
1. Use the CD pipeline for all changes to all environments: changes to code, infrastructure, and runtime configuration
2. Build the binaries once (and protect them)
3. Keep development and test environments in sync with production (as closely as possible)
4. Isolate differences between environments in runtime variables
5. Stop if any step fails—and fix it immediately
6. Run smoke tests/checks after every deployment
7. Automate repetitive/expensive work
8. Timestamp and record every step
1. Use the CD pipeline for all changes to all environments: code changes, changes to runtime configuration, changes to infrastructure.
2. Build binaries once. Version them and sign them or otherwise protect them to ensure that they are not tampered with along the pipeline stages.
3. Use automated configuration management to set up development and test environments to match production (as closely as possible) and to keep all environments in sync.
4. Isolate differences between environments (test, acceptance, staging, production…) in runtime variables that are supplied to the configuration.
5. If any step fails, stop the line. Based on Toyota's Lean Manufacturing principles: if something is wrong, pull the "Andon Cord".
6. Run automated health checks/smoke tests after every deployment or configuration change.
7. Automate repetitive and expensive work wherever possible—minimize manual steps and checks.
8. Audit everything, taking advantage of logs from automated tools. Protect and archive these logs to ensure integrity.
![[Screenshot from 2023-07-29 17-27-10.png]]
Production runtimes are immutable, and nobody has direct access to production servers. Every change (to applications and to infrastructure) must be checked in to source control and deployed through automated pipelines. All pipelines must be identified and registered in advance. Every change must be peer reviewed and must pass several levels of testing and scanning.
https://www.cloudbees.com/blog/blue-ocean-makes-creating-pipelines-fun-and-intuitive-teams-doing-continuous-delivery
• Some security tools can't be easily automated in pipelines—simpler tools that are API-driven work best
• Some checks take too long and have to be done out of band
• Get Security, Dev, and Ops working together to solve problems
• Help engineers to write their own tests
![[Screenshot from 2023-07-29 23-22-40.png]]
In many cases, the “code is the design”, which means that to understand the design, people need to be able to read the code. And this also means that the design changes as the code changes—which is often.
CODE IS DESIGN
This makes it difficult for InfoSec to understand where and how they can review the design for security risks. How do you do threat modeling of the design when the design is never finished and is always changing?
Tools to help perform rapid risk assessments:
• PayPal risk questionnaire for new apps/services
• Mozilla Rapid Risk Assessment (RRA) model: 30-minute review
• Slack goSDL for questions to determine initial risk rating
High-level, basic risk assessments should be done in upfront platform selection and architecture decisions. This should focus on:
• Data classification: What data is sensitive, restricted, or confidential and needs to be protected? What are the legal/compliance restrictions and obligations (for auditing, archival, encryption…)?
• Security risks in platform choice (OS, cloud platform), data management solutions (SQL or NoSQL), languages, and frameworks. The team needs to understand their tools and how to use them properly.
• CD toolchain support: What scanning (DAST, SAST, IAST) tools and other test tools are available based on the language(s) and platform that the team is using?
Ask these questions when you are making changes (based on SAFECode's Tactical Threat Modeling Guide):
1. Are you changing the attack surface?
2. Are you changing the technology stack?
3. Are you changing application security controls?
4. Are you adding confidential/sensitive data?
5. Are you modifying high-risk code?
https://safecode.org/safecodepublications/tactical-threat-modeling/
### Version Control
* Local (e.g., RCS, SCCS)
* Client-Server (e.g., CVS, Subversion)
* Distributed (e.g., git, mercurial)
![[Screenshot from 2023-07-29 23-41-33.png]]
**Code ownership is a model in which a developer or team of developers is responsible for a specific piece of code within a software project.** Code owners can be defined in a special file named CODEOWNERS. People with write permissions for the repository can create or edit the CODEOWNERS file and be listed as code owners. People with admin or owner permissions can require that pull requests be approved by code owners before they can be merged.
Take advantage of engineering teams that are "test obsessed":
• Ensure high levels of unit test coverage for high-risk code
• Review unit tests as well as code when changes are made
• Use "OWASP User Security Stories", "Abuse Cases", and OWASP ASVS Verification Requirements to come up with test cases (more later)
• Make tests count: too many tests will make it expensive to change code
• Red means STOP—ensure the team does not ignore/remove broken tests
• Write unit tests first when fixing vulnerabilities
• Leverage unit tests to refactor buggy/complex code—cover the code in tests, then clean it up in small steps
• Use unit tests to alert on changes to high-risk code (more later)
![[Screenshot from 2023-07-30 00-45-44.png]]
![[Screenshot from 2023-07-30 01-44-31.png]]
Evil User Story: As a software engineer, I shall not be able to deploy high-risk code to production without a security review.
High-risk code includes:
• security controls (authentication, password handling, access control, output encoding libraries, data entitlement checks, user management, crypto methods)
• admin functions
• application code that works with private data
• runtime frameworks
• public network-facing APIs
• legacy code that is known to be tricky to change (high complexity…) or that is known to be buggy
• release/deployment scripts or tooling
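One lightweight way to implement "alert on high-risk code changes" is a job that pins a checksum of the high-risk files and fails, forcing a review, whenever they change (a hypothetical sketch; the file path and the pinned hash are placeholders):

```yaml
alert-high-risk-change:
  stage: test
  image: alpine:latest
  script:
    # Pinned checksum of a security-sensitive file, updated only after an InfoSec review.
    # <pinned-sha256> is a placeholder for the previously reviewed hash.
    - echo "<pinned-sha256>  src/auth/crypto.py" > pinned.sha256
    - sha256sum -c pinned.sha256 || { echo "High-risk code changed - requires security review"; exit 1; }
```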
Many organizations (especially large enterprises) operate a centralized "Scanning Factory" where code is scanned periodically, and the results are triaged and reviewed by InfoSec and then submitted back to the development team for remediation. However, by this time the developers may have already moved on to other work, especially in Agile environments… and in Continuous Deployment, the code has already been deployed to production.
![[Screenshot from 2023-07-30 02-02-49.png]]
View File
@ -0,0 +1,222 @@
Instead of trying to plan and design everything upfront, [[DevOps]] organizations are running continuous experiments and using data from these experiments to drive design and process improvements.
* taking advantage of new tools such as programmable configuration managers and application release automation to simplify and scale everything from design to build and deployment and operations, and taking advantage of cloud services, virtualization, and containers to spin up and run systems faster and cheaper.
DevOps:
Infrastructure as Code:
* Chef, Puppet, and Terraform increase the speed of building systems and scaling them.
* full visibility into configuration details, control over configuration drift and elimination of one-off snowflakes, and a way to define and automatically enforce security policies at run time.
Continuous Delivery:
Continuous monitoring and measurement
This involves creating feedback loops from production back to engineering, collecting metrics, and making them visible to everyone to understand how the system is actually used and using this data to learn and improve. You can extend this to security to provide insight into security threats and enable “Attack-Driven Defense.”
Learning From Failure
* [[chaos engineering]], and practicing for failure in game days.
Amazon has thousands of small ("two pizza") engineering teams working independently and continuously deploying changes across their infrastructure. In 2014, Amazon deployed 50 million changes: that's more than one change deployed every second of every day. So much change so fast... How can security possibly keep up with this rate of change?
[[Lean principles]]
DevOps is heavily influenced by Lean principles: maximizing efficiency and eliminating waste and delays and unnecessary costs.
Major security risks facing users of cloud computing services:
1. Data breaches
2. Weak identity, credential, and access management
3. Insecure interfaces and APIs
4. System and application vulnerabilities
5. Account hijacking
6. Malicious insiders
7. Advanced Persistent Threats (APTs)
8. Data loss
9. Insufficient due diligence
10. Abuse and nefarious use of cloud services
11. Denial of Service
12. Shared technology issues
#microservice
An individual [[microservice]] fits in your head, but the interrelationships among them exceed any human's understanding.
Attack surface. The attack surface of any microservice might be tiny, but the total attack surface of the system can be enormous and hard to see.
Unlike a tiered web application, there is no clear perimeter, no obvious "choke points" where you can enforce authentication or access control rules. You need to make sure that trust boundaries are established and consistently enforced.
The polyglot programming problem. If each team is free to use what they feel are the right tools for the job (like at Amazon), it can become extremely hard to understand and manage security risks across many different languages and frameworks.
Logging strategy, forensics and auditing across different services with different logging approaches can be a nightmare.
[[containerForensics]]
Docker Security Risks
* Kernel exploits
* DOS attacks
If one container can monopolize access to certain resources—including memory and more esoteric resources such as user IDs (UIDs)—it can starve out other containers on the host, resulting in a denial-of-service (DoS), whereby legitimate users are unable to access part or all of the system.
* Container breakouts
users are not namespaced, so any process that breaks out of the container will have the same privileges on the host as it did in the container; if you were `root` in the container, you will be `root` on the host.
* Poisoned Images
* Compromising Secrets
[Docker Bench Security](https://github.com/docker/docker-bench-security)
#todos We could automate this check as part of pulling images (e.g., whenever we run docker pull).
You can lock down a container by using CIS guidelines and other security best practices and using scanning tools like Docker Bench, and you can minimize the container's attack surface by stripping down the runtime dependencies and making sure that developers don't package up development tools in a production container. But all of this requires extra work and knowing what to do.
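For example, Docker Bench could run as a scheduled job on a shell runner that lives on the Docker host (a sketch under that assumption; the mounts are trimmed down from the project's documented run command and may need adjusting):

```yaml
docker-bench:
  tags:
    - localshell    # assumes a shell runner registered on the Docker host
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"   # run from a scheduled pipeline, not on every commit
  script:
    - >
      docker run --rm --net host --pid host --userns host
      -v /etc:/etc:ro
      -v /var/lib:/var/lib:ro
      -v /var/run/docker.sock:/var/run/docker.sock:ro
      docker/docker-bench-security
```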
Etsy's DevSecOps
* Trust people to do the right thing, but still verify. Rely on code reviews and testing and secure defaults and training to prevent or catch mistakes
* “If it Moves, Graph it.” Make data visible to everyone so that everyone can understand and act on it, including information about security risks, threats, and attacks. Data visualizations.
* “Just Ship It.” Every engineer can push to production at any time. This includes security engineers. If something is broken and you can fix it, fix it and ship the fix out right away. (Rich Smith, Director of Security Engineering, Etsy. “Crafting an Effective Security Organization.” QCon 2015, http://www.infoq.com/presentations/security-etsy)
* Understand the real risk to the system and to the organization and deal with problems appropriately.
“*Designated Hackers*” is a system by which each security engineer supports four or five development teams across the organization and is involved in design and standups.
![[Screenshot from 2023-03-10 13-08-33.png]]
This is called *Shifting Security Left*.
and how to take advantage of security features in their application frameworks and security libraries to prevent common security vulnerabilities like injection attacks. The OWASP and [SAFECode](http://www.safecode.org/) communities provide a lot of useful, free tools and frameworks and guidance to help developers with understanding and solving common application security problems in any kind of system.
[[FuzzingSoftware]]
OWASP Proactive Controls
1. Verify for security early and often
2. Parameterize queries ==> Prevent SQL injection by using a parameterized database interface.
3. encode data
4. Validate all inputs
5. Implement identity and authentication controls
6. Implement appropriate access controls
7. Protect data
8. Implement logging and intrusion detection
9. Take advantage of security frameworks and libraries
10. Error and exception handling
**CANARY RELEASING**
Another way to minimize the risk of change in Continuous Delivery or Continuous Deployment is canary releasing. Changes can be rolled out to a single node first, and automatically checked to ensure that there are no errors or negative trends in key metrics (for example, conversion rates), based on "the canary in a coal mine" metaphor. If problems are found with the canary system, the change is rolled back, the deployment is canceled, and the pipeline shut down until a fix is ready to go out. After a specified period of time, if the canary is still healthy, the changes are rolled out to more servers, and then eventually to the entire environment.
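A rough sketch of how a canary rollout can be staged in a pipeline (hypothetical; `deploy.sh` and `check-metrics.sh` are placeholders for the real deployment tooling and monitoring checks):

```yaml
# Progressive rollout with gates between each step; monitoring decides whether to continue.
canary-10:
  stage: deploy
  script:
    - ./deploy.sh --canary 10    # placeholder: route ~10% of servers/traffic to the new version
    - ./check-metrics.sh         # placeholder: check error rates and key business metrics

canary-50:
  stage: deploy
  when: manual                   # proceed only if the 10% canary is healthy
  script:
    - ./deploy.sh --canary 50
    - ./check-metrics.sh

full-rollout:
  stage: deploy
  when: manual
  script:
    - ./deploy.sh --all
```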
**Honeymoon Effect**
Older software that is more vulnerable is easier to attack than software that has recently been changed.
Attacks take time. It takes time to identify vulnerabilities, time to understand them, and time to craft and execute an exploit. This is why many attacks are made against legacy code with known vulnerabilities. In an environment where code and configuration changes are rolled out quickly and changed often, it is more difficult for attackers to follow what is going on, to identify a weakness, and to understand how to exploit it. The system becomes a moving target. By the time attackers are ready to make their move, the code or configuration might have already been changed and the vulnerability might have been moved or closed.
Continuous Delivery is provisioning and configuring test environments to match production as closely as possible—automatically. This includes packaging the code and deploying it to test environments; running acceptance, stress, and performance tests, as well as security tests and other checks, with pass/fail feedback to the team, all automatically; and auditing all of these steps and communicating status to a dashboard. Later, you use the same pipeline to deploy the changes to production.
# Injecting Security into Continuous Delivery
Ask these questions before you start:
What happens before and when a change is checked in?
• Where are the repositories? Who has access to them?
• How do changes transition from check-in to build to Continuous Integration and unit testing, to functional and integration testing, and to staging and then finally to production?
• What tests are run? Where are the results logged?
• What tools are used? How do they work?
• What manual checks or reviews are performed and when?
![[Screenshot from 2023-03-10 13-47-48.png]]
## Precommit
Lightweight iterative threat modeling and risk assessments
SAST (static analysis) checking in the engineer's IDE
Peer code reviews (for defensive coding and security vulnerabilities)
## Commit Stage (Continuous Integration)
This is automatically triggered by a check in. In this stage, you build and perform basic automated testing of the system. These steps return fast feedback to developers: did this change “break the build”?
Security checks that you should include in this stage:
• Compile and build checks, ensuring that these steps are clean, and that there are no errors or warnings
• Software Component Analysis in the build, identifying risk in third-party components
• Incremental static analysis scanning for bugs and security vulnerabilities
• Alerting on high-risk code changes through static analysis checks or tests
• Automated unit testing of security functions, with code coverage analysis
• Digitally signing binary artifacts and storing them in secure repositories (For software that is distributed externally, this should involve signing the code with a code-signing certificate from a third-party CA. For internal code, a hash should be enough to ensure code integrity.)
## Acceptance Stage
To minimize the time required, these tests are often fanned out to different test servers and executed in parallel. Following a “fail fast” approach, the more expensive and time-consuming tests are left until as late as possible in the test cycle, so that they are only executed if other tests have already passed.
• Secure, automated configuration management and provisioning of the runtime environment (using tools like Ansible, Chef, Puppet, Salt, and/or Docker). Ensure that the test environment is clean and configured to match production as closely as possible.
• Automatically deploy the latest good build from the binary artifact repository.
• Smoke tests (including security tests) designed to catch mistakes in configuration or deployment.
• Targeted dynamic scanning (DAST).
• Automated functional and integration testing of security features.
• Automated security attacks, using Gauntlt or other security tools.
• Deep static analysis scanning (can be done out of band).
• Fuzzing (of APIs, files). This can be done out of band.
• Manual pen testing (out of band).
## Production Deployment and Post-Deployment
Triggered after manual review/approvals and scheduling (in Continuous Delivery) or automatically (in Continuous Deployment).
* Secure automated configuration management and provisioning of the runtime environment
* Automated deployment and release orchestration
* Post-Deployment [[smoke test]]
* Automated runtime asserts and compliance checks (monkeys)
* Production monitoring/feedback
* Runtime defense
* Red teaming
* Bug bounties
* Blameless postmortems (learning from failure)
## Source code
Luckily, you can do this automatically by using Software Component Analysis (SCA) tools like OWASP's Dependency Check project or commercial tools like Sonatype's Nexus Lifecycle or SourceClear.
OWASP's Dependency Check is an open source scanner that catalogs open source components used in an application. It works for Java, .NET, Ruby (gemspec), PHP (composer), Node.js and Python, and some C/C++ projects. Dependency Check integrates with common build tools (including Ant, Maven, and Gradle) and CI servers like Jenkins.
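A minimal sketch of running it from a Maven build in the pipeline (the `failBuildOnCVSS` threshold and the report path are assumptions about how you might want to gate and publish results):

```yaml
dependency-check:
  stage: test
  image: maven:3.8.5-openjdk-11-slim    # same image family used in the GitLab examples later in these notes
  script:
    # Scan declared dependencies against known-vulnerability databases;
    # fail the job if anything at or above the CVSS threshold is found.
    - mvn org.owasp:dependency-check-maven:check -DfailBuildOnCVSS=7
  artifacts:
    when: always
    paths:
      - target/dependency-check-report.html
```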
Code review tools need to be investigated.
You should not rely on only one tool—even the best tools will catch only some of the problems in your code. Good practice would be to run at least one of each kind to look for different problems in the code, as part of an overall code quality and security program.
You can use tools like OWASP ZAP to automatically scan a web app for common vulnerabilities as part of the Continuous Integration/Continuous Delivery pipeline. You can do this by running the scanner in headless mode through the command line, through the scanner's API, or by using a wrapper of some kind, such as the ZAProxy Jenkins plug-in or a higher-level test framework like BDD-Security (which we'll look at in a later section).
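For instance, ZAP's baseline scan can run headless as a pipeline job against a deployed test environment (a sketch; the image name and target URL are assumptions, and newer ZAP releases publish the image under a different name):

```yaml
zap-baseline:
  image:
    name: owasp/zap2docker-stable    # assumed image; recent releases ship as ghcr.io/zaproxy/zaproxy
    entrypoint: [""]
  script:
    # Passive baseline scan of the test/staging environment.
    # The script exits non-zero when alerts are raised, giving a clear pass/fail.
    - zap-baseline.py -t https://staging.example.com
```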
#fuzzing
Some newer fuzzing tools are designed to run (or can be adapted to run) in Continuous Integration and Continuous Delivery. They let you seed test values to create repeatable tests, set time boxes on test runs, detect duplicate errors, and write scripts to automatically set up/restore state in case the system crashes. But you might still find that fuzz testing is best done out of band.
Behavior-Driven Development (BDD) and Test-Driven Development (TDD)—wherein developers write tests before they write the code—encourage developers to create a strong set of automated tests to catch mistakes and protect themselves from regressions as they add new features or make changes or fixes to the code.
## Automated Attacks
Tools for automated attacks
• Gauntlt
• Mittn
• BDD-Security
## Vulnerability Management
• How many vulnerabilities have you found?
• How were they found? What tools or testing approaches are giving you the best returns?
• What are the most serious vulnerabilities?
• How long are they taking to get fixed? Is this getting better or worse over time?
Feed the results from the Continuous Delivery pipeline into a vulnerability manager, such as Code Dx or ThreadFix.
Securing the Continuous Delivery pipeline:
• Harden the systems that host the source and build artifact repositories, the Continuous Integration and Continuous Delivery server(s), and the systems that host the configuration management, build, deployment, and release tools. Ensure that you clearly understand—and control—what is done on-premises and what is in the cloud.
• Harden the Continuous Integration and/or Continuous Delivery server. Tools like Jenkins are designed for developer convenience and are not secure by default. Ensure that these tools (and the required plug-ins) are kept up-to-date and tested frequently.
• Lock down and harden your configuration management tools. See "How to be a Secure Chef," for example.
• Ensure that keys, credentials, and other secrets are protected. Get secrets out of scripts and source code and plain-text files and use an audited, secure secrets manager like Chef Vault, Square's Keywhiz project, or HashiCorp Vault.
• Secure access to the source and binary repos and audit access to them.
• Implement access control across the entire tool chain. Do not allow anonymous or shared access to the repos, to the Continuous Integration server, or to the configuration manager or any other tools.
• Change the build steps to sign binaries and other build artifacts to prevent tampering.
• Periodically review the logs to ensure that they are complete and that you can trace a change through from start to finish. Ensure that the logs are immutable, that they cannot be erased or forged.
• Ensure that all of these systems are monitored as part of the production environment.
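For example, rather than a password sitting in a deploy script, a job can pull credentials at run time from protected, masked CI variables or from a secrets manager (a sketch; the variable names, registry, and Vault path are placeholders, and the Vault CLI is assumed to be installed and authenticated on the runner):

```yaml
deploy:
  stage: deploy
  script:
    # Option 1: a protected + masked CI variable injected by the CI server at run time.
    - echo "$REGISTRY_TOKEN" | docker login -u deployer --password-stdin registry.example.com
    # Option 2: fetch the credential from HashiCorp Vault just before it is needed.
    - export DB_PASSWORD="$(vault kv get -field=password secret/myapp/db)"
    - ./deploy.sh    # placeholder deployment script; reads DB_PASSWORD from the environment
```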
Runtime Application Security Protection/Self-Protection (RASP)
which uses run-time instrumentation to catch security problems as they occur. Like application firewalls, RASP can automatically identify and block attacks. And like application firewalls, you can extend RASP to legacy apps for which you don't have source code.
There are only a small number of RASP solutions available today, mostly limited to applications that run in the Java JVM and .NET CLR, although support for other languages like Node.js, Python, and Ruby is emerging. These tools include the following:
• Immunio
• Waratek
• Prevoty
View File
@ -0,0 +1,18 @@
GitHub Actions goes beyond just DevOps and lets you run workflows when other events happen in your repository. For example, you can run a workflow to automatically add the appropriate labels whenever someone creates a new issue in your repository.
GitHub provides Linux, Windows, and macOS virtual machines to run your workflows, or you can host your own self-hosted runners in your own data center or cloud infrastructure.
### [Actions](https://docs.github.com/en/actions/learn-github-actions/understanding-github-actions#actions)
An _action_ is a custom application for the GitHub Actions platform that performs a complex but frequently repeated task. Use an action to help reduce the amount of repetitive code that you write in your workflow files. An action can pull your git repository from GitHub, set up the correct toolchain for your build environment, or set up the authentication to your cloud provider.
You can write your own actions, or you can find actions to use in your workflows in the GitHub Marketplace.
For more information, see "[Creating actions](https://docs.github.com/en/actions/creating-actions)."
https://youtu.be/TLB5MY9BBa4
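A minimal workflow file as a sketch (assuming a Node.js project and a file at `.github/workflows/ci.yml`):

```yaml
name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  build-and-test:
    runs-on: ubuntu-latest           # GitHub-hosted Linux runner; could also be a self-hosted runner
    steps:
      - uses: actions/checkout@v4    # action that pulls the repository
      - uses: actions/setup-node@v4  # action that sets up the build toolchain
        with:
          node-version: 20
      - run: npm ci
      - run: npm test
```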
View File
@ -0,0 +1,647 @@
The example YAML file must be named **.gitlab-ci.yml** (GitLab's default CI configuration file name).
```yaml
stages:
# - test
- build
- deploy
pre-job:
stage: .pre
script:
- echo 'This message is from .pre-job'
build-job:
stage: build
script:
- echo 'This message is from build-job'
test-job:
stage: test
script:
- echo 'This message is from test-job'
deploy-job:
stage: deploy
script:
- echo 'This message is from deploy-job'
post-job:
stage: .post
script:
- echo 'This message is from .post-job'
```
![[Screenshot from 2023-03-14 11-12-33.png]]
The default stages run in their default order; beyond that, you can define your own stages and their order:
```yaml
stages:
  - test1
  - test2
  - test3
```
Example:
```yaml
stages:
- build
- deploy
build:
image: node
stage: build
script:
# - apt update -y
# - apt install npm -y
- npm install
artifacts:
paths:
- node_modules
- package-lock.json
# expire_in: 1 week
deploy:
image: node
stage: deploy
script:
# - apt update -y
# - apt install nodejs -y
- node index.js > /dev/null 2>&1 & # this command runs in the background and does not affect the job timeout
```
## Gitlab Runners
The application that picks up and executes CI/CD jobs.
Settings > Shared runners or Specific runners
Runners have tags like docker, mongodb, ruby. The tags describe which jobs a runner can handle.
For example, we can use the windows tag in our YAML:
```yaml
windows-info:
tags:
- windows
script:
- systeminfo
```
The runner must be the same version as GitLab.
**sudo gitlab-runner register** registers a runner. You can get the runner token from Settings.
Run gitlab-runner locally:
```yaml
stages:
- build_stage
- deploy_stage
build:
stage: build_stage
script:
- docker --version
- docker build -t pyapp .
tags:
- localshell
- localrunner
deploy:
stage: deploy_stage
script:
- docker stop pyappcontainer1 || true && docker rm pyappcontainer1 || true
- docker run -d --name pyappcontainer1 -p 8080:8080 pyapp
tags:
- localshell
- localrunner
```
Add the gitlab-runner user to the docker group ==> sudo usermod -aG docker gitlab-runner
![[Screenshot from 2023-03-14 13-08-06.png]]
Variables ==> used for security tokens, URLs, long strings, etc.
GitLab predefined variables: [url](https://docs.gitlab.com/ee/ci/variables/predefined_variables.html)
Predefined variables:
```yaml
demo_job:
script:
- echo $CI_COMMIT_MESSAGE
- echo $CI_JOB_NAME
```
Directly set in the YAML:
```yaml
variables:
name: 'John'
message: 'How are you?'
display_message:
variables:
name: 'Mark'
script:
- echo "Hello $name, $message"
```
Secret variables (defined in the project's CI/CD settings):
```yaml
push_image:
script:
- docker login -u $USERNAME -p $PASSWORD
- docker tag pyapp:latest $USERNAME/mypyapp:latest
- docker push $USERNAME/mypyapp:latest
tags:
- localshell
- localrunner
```
# Environments
```yaml
stages:
- test
- build
- deploy staging
- automated testing
- deploy production
variables:
IMAGE_TAG: $CI_REGISTRY_IMAGE/employee-image:$CI_COMMIT_SHORT_SHA
STAGING_APP: emp-portal-staging
PRODUCTION_APP: emp-portal-production
HEROKU_STAGING: "registry.heroku.com/$STAGING_APP/web"
HEROKU_PRODUCTION: "registry.heroku.com/$PRODUCTION_APP/web"
lint_test:
image: python:3.8.0-slim
stage: test
before_script:
- pip install flake8-html
script:
- flake8 --format=html --htmldir=flake_reports/
artifacts:
when: always
paths:
- flake_reports/
pytest:
image: python:3.8.0-slim
stage: test
before_script:
- pip install pytest-html
- pip install -r requirements.txt
script:
- pytest --html=pytest_reports/pytest-report.html --self-contained-html
artifacts:
when: always
paths:
- pytest_reports/
build:
image: docker:latest
services:
- docker:dind
stage: build
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker build -t $IMAGE_TAG .
- docker images
- docker push $IMAGE_TAG
deploy_stage:
image: docker:latest
services:
- docker:dind
stage: deploy staging
environment:
name: staging
url: https://$STAGING_APP.herokuapp.com/
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker pull $IMAGE_TAG
- docker tag $IMAGE_TAG $HEROKU_STAGING
- docker login -u _ -p $HEROKU_STAGING_API_KEY registry.heroku.com
- docker push $HEROKU_STAGING
- docker run --rm -e HEROKU_API_KEY=$HEROKU_STAGING_API_KEY wingrunr21/alpine-heroku-cli container:release web --app $STAGING_APP
- echo "App deployed to stagig server at https://$STAGING_APP.herokuapp.com/"
only:
- main
test_stage:
image: alpine
stage: automated testing
before_script:
- apk --no-cache add curl
script:
- curl https://$STAGING_APP.herokuapp.com/ | grep "Employee Data"
only:
- main
deploy_production:
image: docker:latest
services:
- docker:dind
stage: deploy production
environment:
name: production
url: https://$PRODUCTION_APP.herokuapp.com/
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker pull $IMAGE_TAG
- docker tag $IMAGE_TAG $HEROKU_PRODUCTION
- docker login -u _ -p $HEROKU_PRODUCTION_API_KEY registry.heroku.com
- docker push $HEROKU_PRODUCTION
- docker run --rm -e HEROKU_API_KEY=$HEROKU_PRODUCTION_API_KEY wingrunr21/alpine-heroku-cli container:release web --app $PRODUCTION_APP
- echo "App deployed to production server at https://$PRODUCTION_APP.herokuapp.com/"Project - deploy to production
only:
- main
```
environment:
name: production
url: https://$PRODUCTION_APP.herokuapp.com/
# Dynamic environments
https://gitlab.com/gitlab-org/gitlab-runner/-/issues/1809
```yaml
stages:
- test
- build
- deploy feature
- automated feature testing
- deploy staging
- automated testing
- deploy production
variables:
IMAGE_TAG: $CI_REGISTRY_IMAGE/employee-image:$CI_COMMIT_SHORT_SHA
STAGING_APP: emp-portal-staging
PRODUCTION_APP: emp-portal-production
HEROKU_STAGING: "registry.heroku.com/$STAGING_APP/web"
HEROKU_PRODUCTION: "registry.heroku.com/$PRODUCTION_APP/web"
lint_test:
image: python:3.8.0-slim
stage: test
before_script:
- pip install flake8-html
script:
- flake8 --format=html --htmldir=flake_reports/
artifacts:
when: always
paths:
- flake_reports/
pytest:
image: python:3.8.0-slim
stage: test
before_script:
- pip install pytest-html
- pip install -r requirements.txt
script:
- pytest --html=pytest_reports/pytest-report.html --self-contained-html
artifacts:
when: always
paths:
- pytest_reports/
build:
image: docker:latest
services:
- docker:dind
stage: build
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker build -t $IMAGE_TAG .
- docker images
- docker push $IMAGE_TAG
deploy_feature:
image: docker:latest
services:
- docker:dind
stage: deploy feature
environment:
name: review/$CI_COMMIT_REF_NAME
url: https://$CI_ENVIRONMENT_SLUG.herokuapp.com/
before_script:
- export FEATURE_APP="$CI_ENVIRONMENT_SLUG"
- export HEROKU_FEATURE="registry.heroku.com/$FEATURE_APP/web"
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- echo "FEATURE_APP=$CI_ENVIRONMENT_SLUG" >> deploy_feature.env
- docker pull $IMAGE_TAG
- docker tag $IMAGE_TAG $HEROKU_FEATURE
- docker run --rm -e HEROKU_API_KEY=$HEROKU_STAGING_API_KEY wingrunr21/alpine-heroku-cli create $FEATURE_APP
- docker login -u _ -p $HEROKU_STAGING_API_KEY registry.heroku.com
- docker push $HEROKU_FEATURE
- docker run --rm -e HEROKU_API_KEY=$HEROKU_STAGING_API_KEY wingrunr21/alpine-heroku-cli container:release web --app $FEATURE_APP
- echo "App deployed to FEATURE server at https://$FEATURE_APP.herokuapp.com/"
artifacts:
reports:
dotenv: deploy_feature.env
only:
- /^feature-.*$/
test_feature:
image: alpine
stage: automated feature testing
before_script:
- apk --no-cache add curl
script:
- curl https://$FEATURE_APP.herokuapp.com/ | grep "Employee Data"
dependencies:
- deploy_feature
only:
- /^feature-.*$/
deploy_stage:
image: docker:latest
services:
- docker:dind
stage: deploy staging
environment:
name: staging
url: https://$STAGING_APP.herokuapp.com/
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker pull $IMAGE_TAG
- docker tag $IMAGE_TAG $HEROKU_STAGING
- docker login -u _ -p $HEROKU_STAGING_API_KEY registry.heroku.com
- docker push $HEROKU_STAGING
- docker run --rm -e HEROKU_API_KEY=$HEROKU_STAGING_API_KEY wingrunr21/alpine-heroku-cli container:release web --app $STAGING_APP
- echo "App deployed to stagig server at https://$STAGING_APP.herokuapp.com/"
only:
- main
test_stage:
image: alpine
stage: automated testing
before_script:
- apk --no-cache add curl
script:
- curl https://$STAGING_APP.herokuapp.com/ | grep "Employee Data"
only:
- main
deploy_production:
image: docker:latest
services:
- docker:dind
stage: deploy production
environment:
name: production
url: https://$PRODUCTION_APP.herokuapp.com/
before_script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
script:
- docker pull $IMAGE_TAG
- docker tag $IMAGE_TAG $HEROKU_PRODUCTION
- docker login -u _ -p $HEROKU_PRODUCTION_API_KEY registry.heroku.com
- docker push $HEROKU_PRODUCTION
- docker run --rm -e HEROKU_API_KEY=$HEROKU_PRODUCTION_API_KEY wingrunr21/alpine-heroku-cli container:release web --app $PRODUCTION_APP
- echo "App deployed to production server at https://$PRODUCTION_APP.herokuapp.com/"Project - deploy to production
only:
- main
when: manual
```
# GitLab DevSecOps
SAST with SonarCloud:
```yaml
stages:
- runSAST
run-sast-job:
stage: runSAST
image: maven:3.8.5-openjdk-11-slim
script: |
mvn verify package sonar:sonar -Dsonar.host.url=https://sonarcloud.io/ -Dsonar.organization=gitlabdevsecopsintegration -Dsonar.projectKey=gitlabdevsecopsintegration -Dsonar.login=token01
```
==================================================================
SonarCloud quality gates
```
1) Create Custom Quality Gate in SonarCloud and Add conditions to the Quality Gate
2) Assign this Quality Gate to the Project
3) Add script in .gitlab-ci.yml file to enable quality gate check (Note: This will fail your build in case Quality Gate fails)
sleep 5
apt-get update
apt-get -y install curl jq
quality_status=$(curl -s -u 14ad4797c02810a818f21384add02744d3f9e34d: https://sonarcloud.io/api/qualitygates/project_status?projectKey=gitLabdevsecopsintegration | jq -r '.projectStatus.status')
echo "SonarCloud Analysis Status is $quality_status";
if [[ $quality_status == "ERROR" ]] ; then exit 1;fi
-----------Sample JSON Response from SonarCloud or SonarQube Quality Gate API---------------------
{
"projectStatus": {
"status": "ERROR",
"conditions": [
{
"status": "ERROR",
"metricKey": "coverage",
"comparator": "LT",
"errorThreshold": "90",
"actualValue": "0.0"
}
],
"periods": [],
"ignoredConditions": false
}
}
```
```yaml
stages:
- runSAST
run-sast-job:
stage: runSAST
image: maven:3.8.5-openjdk-11-slim
script: |
apt-get update
apt-get -y install curl jq
mvn verify package sonar:sonar -Dsonar.host.url=https://sonarcloud.io/ -Dsonar.organization=gitlabdevsecopsintegrtion -Dsonar.projectKey=gitLabdevsecopsintegration -Dsonar.login=14ad4797c02810a818f21384add02744d3f9e34d
sleep 5
quality_status=$(curl -s -u 14ad4797c02810a818f21384add02744d3f9e34d: https://sonarcloud.io/api/qualitygates/project_status?projectKey=gitLabdevsecopsintegration | jq -r '.projectStatus.status')
echo "SonarCloud Analysis Status is $quality_status";
if [[ $quality_status == "ERROR" ]] ; then exit 1;fi
```
==================================================================
Test coverage
```
1) Unit Test cases should be present in test folder
2) Junit Plugin should be present in pom.xml
3) Jacoco Plugin should be present in pom.xml
4) Jacoco report execution goal should be present in build tag in pom.xml
5) Maven "verify" goal should be run while running sonar analysis
```
```yaml
stages:
- runSAST
run-sast-job:
stage: runSAST
image: maven:3.8.5-openjdk-11-slim
script: |
mvn verify package sonar:sonar -Dsonar.host.url=https://sonarcloud.io/ -Dsonar.organization=gitlabdevsecopsintegration -Dsonar.projectKey=gitlabdevsecopsintegration -Dsonar.login=2fda8f4a1af600afbede42c54c868083d8e34c01
```
==================================================================
SCA in GitLab
Steps to integrate Snyk using .gitlab-ci.yml file:
1) Add Snyk Plugin to Pom.xml
2) Define Snyk Token as an environment Variable on the runner machine
3) Add code changes to .gitlab-ci.yml file
```yaml
stages:
- runSCAScanUsingSnyk
run-sca-job:
stage: runSCAScanUsingSnyk
image: maven:3.8.5-openjdk-11-slim
script: |
SNYK_TOKEN='2f4afa39-c493-4c6d-b34e-080c1a8f9014'
export SNYK_TOKEN
mvn snyk:test -fn
```
==================================================================
DAST tool using OWASP ZAP
```yaml
stages:
- runDASTScanUsingZAP
run-dast-job:
stage: runDASTScanUsingZAP
image: maven:3.8.5-openjdk-11-slim
script: |
apt-get update
apt-get -y install wget
wget https://github.com/zaproxy/zaproxy/releases/download/v2.11.1/ZAP_2.11.1_Linux.tar.gz
mkdir zap
tar -xvf ZAP_2.11.1_Linux.tar.gz
cd ZAP_2.11.1
./zap.sh -cmd -quickurl https://www.example.com -quickprogress -quickout ../zap_report.html
artifacts:
paths:
- zap_report.html
```
==================================================================
End-to-end CI/CD pipeline for Java projects
```yaml
stages:
- runSASTScanUsingSonarCloud
- runSCAScanUsingSnyk
- runDASTScanUsingZAP
run-sast-job:
stage: runSASTScanUsingSonarCloud
image: maven:3.8.5-openjdk-11-slim
script: |
mvn verify package sonar:sonar -Dsonar.host.url=https://sonarcloud.io/ -Dsonar.organization=gitlabdevsecopsintegration -Dsonar.projectKey=gitlabdevsecopsintegration -Dsonar.login=2fda8f4a1af600afbede42c54c868083d8e34c01
run-sca-job:
stage: runSCAScanUsingSnyk
image: maven:3.8.5-openjdk-11-slim
script: |
SNYK_TOKEN='2f4afa39-c493-4c6d-b34e-080c1a8f9014'
export SNYK_TOKEN
mvn snyk:test -fn
run-dast-job:
stage: runDASTScanUsingZAP
image: maven:3.8.5-openjdk-11-slim
script: |
apt-get update
apt-get -y install wget
wget https://github.com/zaproxy/zaproxy/releases/download/v2.11.1/ZAP_2.11.1_Linux.tar.gz
mkdir zap
tar -xvf ZAP_2.11.1_Linux.tar.gz
cd ZAP_2.11.1
./zap.sh -cmd -quickurl https://www.example.com -quickprogress -quickout ../zap_report.html
artifacts:
paths:
- zap_report.html
```
# GitLab built-in SAST and DAST
GitLab SAST Analyzer Documentation Page: https://docs.gitlab.com/ee/user/application_security/sast/
GitLab DAST Analyzer Documentation Page: https://docs.gitlab.com/ee/user/application_security/dast/
```yaml
include:
- template: Security/SAST.gitlab-ci.yml
- template: DAST.gitlab-ci.yml
variables:
SAST_EXPERIMENTAL_FEATURES: "true"
DAST_WEBSITE: http://www.example.com
DAST_FULL_SCAN_ENABLED: "true"
DAST_BROWSER_SCAN: "true"
stages:
- test
- runSASTScanUsingSonarCloud
- runSCAScanUsingSnyk
- runDASTScanUsingZAP
- dast
run-sast-job:
stage: runSASTScanUsingSonarCloud
image: maven:3.8.5-openjdk-11-slim
script: |
mvn verify package sonar:sonar -Dsonar.host.url=https://sonarcloud.io/ -Dsonar.organization=gitlabdevsecopsintegrationkey -Dsonar.projectKey=gitlabdevsecopsintegrationkey -Dsonar.login=9ff892826b54980437f4fb0fbc72f4049ec97585
run-sca-job:
stage: runSCAScanUsingSnyk
image: maven:3.8.5-openjdk-11-slim
script: |
SNYK_TOKEN='2f4afa39-c493-4c6d-b34e-080c1a8f9014'
export SNYK_TOKEN
mvn snyk:test -fn
run-dast-job:
stage: runDASTScanUsingZAP
image: maven:3.8.5-openjdk-11-slim
script: |
apt-get update
apt-get -y install wget
wget https://github.com/zaproxy/zaproxy/releases/download/v2.11.1/ZAP_2.11.1_Linux.tar.gz
mkdir zap
tar -xvf ZAP_2.11.1_Linux.tar.gz
cd ZAP_2.11.1
./zap.sh -cmd -quickurl https://www.example.com -quickprogress -quickout ../zap_report.html
artifacts:
paths:
- zap_report.html
```
View File
@ -0,0 +1,55 @@
# How to create simple setup with docker-compose
See this [url](https://www.cloudbees.com/blog/how-to-install-and-run-jenkins-with-docker-compose)
Note: the JVM/Java path in this link is wrong => /opt/java/openjdk/bin/java
# Reddit recommendations
[YOUTUBE](https://youtu.be/MTm3cb7qiEo?list=PLVx1qovxj-akoYTAboxT1AbHlPmrvRYYZ)
[Docs](https://www.jenkins.io/doc/pipeline/tour/getting-started/)
Install Linux slave in jenkins [url](https://youtu.be/pzG_ZQNbZug)
Install Windows slave in windows [url](https://youtu.be/655a1itG3xg?list=PLVx1qovxj-akoYTAboxT1AbHlPmrvRYYZ)
====================================================
Books
* Jenkins: The Definitive Guide
* Jenkins 2: Up and Running Evolve Your Deployment Pipeline for Next Generation Automation
# Jenkins: The Definitive Guide
[Github Repo Link](https://github.com/ricardoandre97/jenkins-resources.git)
Continuous Integration is about reducing risk by providing faster feedback.
First and foremost, it is designed to help identify and fix integration and regression issues faster, resulting in smoother, quicker delivery, and fewer bugs.
The practice of automatically deploying every successful build directly into production is generally known as Continuous Deployment. However, a pure Continuous Deployment approach is not always appropriate for everyone. For example, many users would not appreciate new versions falling into their laps several times a week, and prefer a more predictable (and transparent) release cycle. Commercial and marketing considerations might also play a role in when a new release should actually be deployed.
# Introducing Continuous Integration into Your Organization
Phase 1—Create Build Server
Phase 2—Nightly Builds
Phase 3—Nightly Builds and Basic Automated Tests
Phase 4—Enter the Metrics
Automated code quality and code coverage metrics. The code quality build also automatically generates API documentation for the application.
Phase 5—Getting More Serious About Testing
Test-Driven Development is more widely practiced. The application is no longer simply compiled and tested, but if the tests pass, it is automatically deployed to an application server for more comprehensive end-to-end tests and performance tests.
Phase 6—Automated Acceptance Tests and More Automated Deployment
Behavior-Driven Development and Acceptance-Test-Driven Development tools act as communication and documentation tools as much as testing tools, publishing reports on test results in business terms that non-developers can understand. The application is automatically deployed into test environments for testing by the QA team either as changes are committed, or on a nightly basis; a version can be deployed (or "promoted") to UAT and possibly production environments using a manually-triggered build when testers consider it ready, with the ability to roll back to a previous release if something goes horribly wrong.
Phase 7—Continuous Deployment
# Chapter 2
# Udemy Course
View File
@ -0,0 +1,154 @@
# Index
* Setting up the Azure DevOps environment
* Demonstrating Azure DevOps and the HVL cloud
* ++[Local install steps](https://www.flexmind.co/azure-devops-local-server/#:~:text=Azure%20DevOps%20Server%20Installation%20Steps%20%3A%201%201.,exe%20file%20downloaded%20for%20us%20.%20More%20items)
```bash
No hosted parallelism has been purchased or granted. To request a free parallelism grant, please fill out the following form https://aka.ms/azpipelines-parallelism-reques
```
* Installing a local agent
*
### Repo url [link](https://github.com/HVLRED/azure-devops-basics)
### First pipeline yml
Name of the file: azure-pipelines.yml
```yaml
# Maven
# Build your Java project and run tests with Apache Maven.
# Add steps that analyze code, save build artifacts, deploy, and more:
# https://docs.microsoft.com/azure/devops/pipelines/languages/java
trigger:
- main
pool:
name: hvlubuntu
steps:
- task: Maven@1
inputs:
mavenPomFile: 'pom.xml'
publishJUnitResults: true
testResultsFiles: '**/surefire-reports/TEST-*.xml'
javaHomeOption: 'JDKVersion'
mavenVersionOption: 'Default'
mavenAuthenticateFeed: false
effectivePomSkip: false
sonarQubeRunAnalysis: false
```
![[Pasted image 20230714155723.png]]
**CI/CD Build and Release Pipelines**
![[Pasted image 20230714155935.png]]
### Change index.jsp and trigger pipeline
## Show source code and build dir.
![[Screenshot from 2023-07-14 16-37-45.png]]
### Copy artifacts
```yaml
# Maven
# Build your Java project and run tests with Apache Maven.
# Add steps that analyze code, save build artifacts, deploy, and more:
# https://docs.microsoft.com/azure/devops/pipelines/languages/java
trigger:
- main
pool:
name: hvlubuntu
steps:
- task: Maven@1
inputs:
mavenPomFile: 'pom.xml'
publishJUnitResults: true
testResultsFiles: '**/surefire-reports/TEST-*.xml'
javaHomeOption: 'JDKVersion'
mavenVersionOption: 'Default'
mavenAuthenticateFeed: false
effectivePomSkip: false
sonarQubeRunAnalysis: false
- task: CopyFiles@2
inputs:
Contents: '**/*.war'
TargetFolder: '$(build.artifactstagingdirectory)'
```
![[Pasted image 20230714164425.png]]
### To see the results in Azure DevOps we need to publish artifacts
![[Pasted image 20230714164855.png]]
```yaml
# Maven
# Build your Java project and run tests with Apache Maven.
# Add steps that analyze code, save build artifacts, deploy, and more:
# https://docs.microsoft.com/azure/devops/pipelines/languages/java
trigger:
- main
pool:
name: hvlubuntu
steps:
- task: Maven@1
inputs:
mavenPomFile: 'pom.xml'
publishJUnitResults: true
testResultsFiles: '**/surefire-reports/TEST-*.xml'
javaHomeOption: 'JDKVersion'
mavenVersionOption: 'Default'
mavenAuthenticateFeed: false
effectivePomSkip: false
sonarQubeRunAnalysis: false
- task: CopyFiles@2
inputs:
Contents: '**/*.war'
TargetFolder: '$(build.artifactstagingdirectory)'
- task: PublishPipelineArtifact@1
inputs:
targetPath: '$(Pipeline.Workspace)'
artifact: 'warfile'
publishLocation: 'pipeline'
```
View File
@ -0,0 +1,75 @@
![[Screenshot from 2023-03-14 15-53-57.png]]
**Software Dojos: Iterative Learning**
Software Dojos — or training facilities. With a globally diverse workforce, we strive to provide ways for our employees to upskill and master DevSecOps practices to commit to the most effective software delivery. After developing initial skills, adoption and adherence of these best practices can grow over time. With common training grounds, our employees continue to leverage the best code across multiple domains.
#article
## The Best of Both Worlds: Agile Development Meets Product Line Engineering at Lockheed Martin
Product line engineering (PLE) brings large-scale improvements in cost, time to market, product quality, and more. It promotes adaptive planning, evolutionary development, early delivery, continuous improvement, and encourages rapid and flexible response to change.
This paper conveys the experience of Lockheed Martin, the world's largest defense contractor, as it is applying PLE and Agile together on one of its largest and most important projects. Not only is the project highly visible with demanding requirements, it is also very large, comprising some 10 million lines of code.
## PLE as Factory
Manufacturers have long used engineering techniques to create a product line of similar
products using a common factory that assembles and configures parts to produce the varying products in the product line. For example, automotive manufacturers can create thousands of unique variations of one car model using a single pool of parts carefully designed to be configurable and factories specifically designed to configure and assemble those parts.
In PLE, the configurator is the factory's automation component; the "parts" are the assets in the factory's supply chain. A statement of the properties desired in the end product tells the configurator how to configure the assets.
A product specification at the top tells the configurator how to configure the assets coming in from the left. This enables the rapid production of any variant of any of the assets for any of the products in the portfolio. The products can comprise any combination of software, systems in which software runs, or non-software systems that have software-representable artifacts (such as requirements, engineering models, or development plans) associated with the engineering process that produces them.
In this context "product" means not only the primary entity being built and delivered, but also all of the artifacts that are produced along with it. Some of these support the engineering process (such as requirements, project plans, design models, and test cases), while others are delivered alongside the thing being built (such as user manuals, shipping labels, and parts lists). These artifacts are the product line's assets.
Shared assets can include, but are not limited to, requirements, design specifications, design models, source code, build files, test plans and test cases, user documentation, repair manuals and installation guides, project budgets, schedules, and work plans, product calibration and configuration files, data models, parts lists, and more.
![[Screenshot from 2023-03-14 16-07-26.png]]
PLE stands in contrast to traditional product-centric development, in which each individual product is developed and evolved independently from other products, or (at best) starts out as a cloned copy of a similar product that is then changed to suit the new product's specific needs. Product-centric development takes very little advantage of the commonalities among products in a portfolio after the initial clone operation.
Consider a production shop in which N products are developed and maintained. In this stylized view, each product comprises requirements, design models, source code, and test cases. Each engineer in this shop works primarily on a single product. When a new product is launched, its project copies the most similar assets it can find, and starts adapting them to meet the new product's needs.
![[Screenshot from 2023-03-14 16-08-26.png]]
A "just enough, just in time" approach with just enough detail to size a project and ensure its technical and economic feasibility.
Agile provides for more level loading and resource allocation. Previously, the response to a looming deadline was to "surge," adding resources for a milestone and then ramping back down. Now, with better planning and tighter customer involvement, that can be avoided. Lessons about teaming are emerging. First, teams should be co-located, if possible. Second, this structure can expose weaker individuals; everyone needs to carry their weight, since there's no place for mediocre performers to hide on a small team. Not everyone is cut out for this approach, as it requires individuals to perform to their best abilities. Culminating each sprint with a review or demo for the customer (typically showing off new features or architectural improvements) establishes trust and instills confidence in the customer and other stakeholders.
# Recommendations from DOD
Recommendation 1: Software Factory
Recommendation 2: Continuous Iterative Development
* deliver a series of viable products (starting with MVP) followed by successive next viable products (NVPs);
* establish MVP and the equivalent of a product manager for each program in its formal acquisition strategy, and arrange for the warfighter to adopt the initial operational capability (IOC) as an MVP for evaluation and feedback
* engage Congress to change statutes to transition Configuration Steering Boards (CSB) to support rapid iterative approaches
Recommendation 3: Risk Reduction and Metrics for New Programs
* Sprint Burndown
* Epic and Release burndown
* Velocity
Recommendation 6: Software is Immortal Software Sustainment
Recommendation 7: Independent Verification and Validation for Machine Learning
![[Screenshot from 2023-03-14 16-20-58.png]]
![[Screenshot from 2023-03-14 16-21-19.png]]
#softwarefactory
View File
@ -0,0 +1,11 @@
# Software Engineering
adsmsadsa
d
asd
as
d
as
da
sd
as
d