
Karen J Yang's Projects

architectingwithgcp_fundamentals_course1_coreinfrastructure

Virtual Machines in Cloud. Use Cloud Launcher to deploy a LAMP stack on a Compute Engine instance. Create a Compute Engine VM using the GCP console and another using the gcloud command-line interface. Connect between the two instances.

Storage in Cloud. Create a Cloud Storage bucket and place an image in it (a scripted sketch of this step follows below). Configure an application running in Compute Engine to use a database managed by Cloud SQL. Configure a web server with PHP. Use the image in the Cloud Storage bucket on a web page.

Containers in Cloud. Create a Kubernetes Engine cluster containing several containers, each running a web server. Deploy and manage Docker containers using kubectl. Place a load balancer in front of the cluster and view its contents.

Applications in Cloud. Preview an App Engine application using Cloud Shell. Launch and disable an App Engine application.

Deployment in Cloud. Create a deployment using Deployment Manager and maintain a consistent deployment state. Update a deployment. View the load (resource usage) on a VM instance using Google Stackdriver.

Big Data & Machine Learning in Cloud. Load data from Cloud Storage into BigQuery. Query the data in BigQuery both in the console and in the shell (using the bq command).
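The Storage in Cloud step can be scripted as well as performed in the console. Below is a minimal sketch using the google-cloud-storage Python client; the bucket name, object path, and local file are illustrative assumptions, not values from the lab.

```python
# Minimal sketch: create a Cloud Storage bucket and upload an image.
# Assumes google-cloud-storage is installed and application-default
# credentials are configured; all names below are illustrative.
from google.cloud import storage

client = storage.Client()

# Create the bucket that the web page will reference.
bucket = client.create_bucket("example-image-bucket-12345")  # hypothetical name

# Upload a local image into the bucket.
blob = bucket.blob("images/banner.png")
blob.upload_from_filename("banner.png")

print(f"Uploaded gs://{bucket.name}/{blob.name}")
```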

architectingwithgcp_fundamentals_course2_essentialcloudinfrastructurefoundation

Console and Cloud Shell. Access Google Cloud Platform. Create a Cloud Storage bucket using the GCP console and Cloud Shell. Understand shell features.

Infrastructure Preview. Use Cloud Launcher to build a Jenkins continuous-integration environment. Manage the service from the Jenkins UI. Administer the service from the virtual machine host through SSH.

Virtual Networking. Understand network layout, placing instances in various locations and establishing communications between virtual machines. Create an auto-mode network, a custom-mode network, and associated subnetworks. Compare connectivity in the various types of networks. Create routes and firewall rules using IP addresses and tags to enable connectivity. Convert an auto-mode network to a custom-mode network. Create, expand, and delete subnetworks.

Bastion Host. Create an application web server to represent a service provided to an internal corporate audience. Block the web server's access to and from the internet. Create a maintenance server, called a bastion host, to gain access to the application server and verify internal connectivity to it.

Virtual Machines. Create a utility virtual machine for administration purposes, a standard VM, and a custom VM. Launch both Windows and Linux VMs, and delete VMs.

Working with Virtual Machines. Create a customized virtual machine instance using an n1-standard-1 machine type that includes a 10 GB boot disk, 1 virtual CPU (vCPU), and 3.75 GB of RAM; it runs Debian Linux by default. Install base software (a headless JRE) and application software (a Minecraft game server). Attach a high-performance 50 GB persistent solid-state drive (SSD) to the instance. The Minecraft server can support up to 50 players. Reserve a static external IP so the address remains consistent. Verify availability of the game server online. Set up a backup system to back up the server's data to a Cloud Storage bucket and test the backup system. Automate backups using cron (a sketch follows below). Set up maintenance scripts using metadata for graceful startup and shutdown of the server.
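The lab automates the Minecraft backup with gsutil from cron; a roughly equivalent sketch in Python with the google-cloud-storage client is shown below. The world directory, bucket name, and cron schedule are illustrative assumptions.

```python
# Minimal backup sketch: copy the Minecraft world directory to Cloud Storage.
# Assumes google-cloud-storage; paths and bucket name are hypothetical.
# Example cron entry (every 4 hours):
#   0 */4 * * * /usr/bin/python3 /opt/minecraft/backup.py
import os
from datetime import datetime, timezone
from google.cloud import storage

WORLD_DIR = "/home/minecraft/world"   # hypothetical world directory
BUCKET = "my-minecraft-backups"       # hypothetical bucket name

def backup_world():
    client = storage.Client()
    bucket = client.bucket(BUCKET)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    # Walk the world directory and upload each file under a timestamped prefix.
    for root, _, files in os.walk(WORLD_DIR):
        for name in files:
            local_path = os.path.join(root, name)
            rel_path = os.path.relpath(local_path, WORLD_DIR)
            bucket.blob(f"backups/{stamp}/{rel_path}").upload_from_filename(local_path)

if __name__ == "__main__":
    backup_world()
```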

architectingwithgcp_fundamentals_course3_essentialcloudinfrastructurecoreservices

Cloud Identity and Access Management (IAM). Use Cloud IAM to implement access control. Grant and revoke IAM roles, first to a second user (Username2). Restrict access to specific features or resources. Allocate Service Account User credentials and "bake" them into a virtual machine to create special-purpose authorized bastion hosts.

Cloud Storage. Create and use buckets. Work with the following features: customer-supplied encryption keys (CSEK), using your own encryption keys and rotating keys; access control lists (ACLs), setting an ACL to private and then modifying it to public; lifecycle management, setting a policy to delete objects after 31 days; versioning, creating a version and restoring a previous version; directory synchronization, recursively synchronizing a VM directory with a bucket; and cross-project resource sharing, using IAM to enable access to resources across projects.

Cloud SQL. Create a Cloud SQL instance and a client VM instance to serve as a database client. Install software. Restrict access to the Cloud SQL instance to a single IP address. Download sample GCP billing data in *.csv format and load it into the database. Improve security by requiring SSL certificates: configure the Cloud SQL instance and the client to use SSL encryption.

Cloud Datastore. Initialize Cloud Datastore. Create content (populate the Datastore database with data entities); a sketch follows below. Query the content, running both "Query by kind" and "Query by GQL" queries. Access the Cloud Datastore Admin console and enable it to clean up and remove test data.

Examining Billing Data with BigQuery. Sign in to BigQuery from the GCP console. Import billing data, generated as a CSV file, into BigQuery. Create a dataset. Create a table. Run a simple query on the file. Access a shared dataset containing more than 22,000 records of billing information. Run queries on the data to explore how to use BigQuery to ask and answer questions.

Resource Monitoring (Stackdriver). Create a Stackdriver account. Enable Stackdriver Monitoring to monitor projects. Add charts to dashboards. Create alerts with multiple conditions. Create resource groups. Create uptime checks for services.

Error Reporting and Debugging (Stackdriver). Launch a Google App Engine application. Introduce a code bug to break the application. Explore Stackdriver Error Reporting to identify the issue, then analyze the problem and find the root cause using Stackdriver Debugger. Modify the code to fix the problem and monitor the change by examining the results through Stackdriver.
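For the Cloud Datastore lab, a minimal sketch of creating an entity and running a "query by kind" with the google-cloud-datastore Python client is shown below; the kind and property names are illustrative assumptions.

```python
# Minimal sketch: populate and query Cloud Datastore.
# Assumes google-cloud-datastore; kind and property names are hypothetical.
from google.cloud import datastore

client = datastore.Client()

# Create a content entity with an auto-allocated ID.
task = datastore.Entity(key=client.key("Task"))
task.update({"description": "Review billing data", "done": False})
client.put(task)

# "Query by kind": fetch Task entities that are not done.
query = client.query(kind="Task")
query.add_filter("done", "=", False)
for entity in query.fetch():
    print(entity["description"])
```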

architectingwithgcp_fundamentals_course4_essentialcloudinfrastructure_scaling-automation

Virtual Private Network (VPN). Created two custom networks and associated subnetworks in different regions. Created VPN gateways in each network. Established static routes to enable the gateways to pass traffic, and configured static routes to pass traffic to the VPN gateways. Established firewall rules to enable ICMP and SSH traffic. Performed most of the configuration from the command line; by configuring the VPN manually, attained a better understanding of what the GCP console does automatically, so as to better troubleshoot a configuration.

Virtual Machine Automation and Load Balancing. Created a pool of VMs acting as web servers and directed traffic to them through an external network. Configured an external load balancer to use the pool, distributing work among the servers. Tested for high availability by placing a load on the service and stopping a VM to simulate an outage. Launched two more VMs in a secondary zone. Configured an internal load balancer. Tested the internal load balancer for work distribution and availability.

Autoscaler. Created a VM, then customized it by installing software and changing a configuration setting (making Apache start on boot). Used the custom image in an instance template, then used the template to make a managed instance group. After the backend and frontend parts were connected, stress-tested the system and triggered autoscaling using Apache Bench (a load-generation sketch follows below). The goal was to set up an HTTP(S) load balancer with autoscaling and verify that it was working.

Infrastructure Automation. Created an IAM service account. Created a VM and authorized it to use the Google Cloud API, using the service account, for the purpose of creating automation tools. Installed open-source software on the VM. Configured and tested the VM by deploying a Hadoop cluster. Created a global solution by generating a snapshot of the boot disk with the service account already "baked in". Recreated the Clustermaker VM in a different region and tested it by deploying another Hadoop cluster in the new region. Learned IaaS skills that can be leveraged to automate activities through the Google Cloud SDK; this is important for Site Reliability Engineering (SRE).
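The lab drives the stress test with Apache Bench; the sketch below generates comparable HTTP load from Python using the requests package, as a plain stand-in rather than what the lab used. The target address, request count, and concurrency are illustrative assumptions.

```python
# Minimal load-generation sketch (stand-in for Apache Bench).
# Assumes the `requests` package; the load-balancer IP is hypothetical.
import concurrent.futures
import requests

TARGET = "http://203.0.113.10/"   # hypothetical load-balancer frontend IP
NUM_REQUESTS = 5000
CONCURRENCY = 50

def hit(_):
    # Return the HTTP status code for one request against the frontend.
    return requests.get(TARGET, timeout=5).status_code

with concurrent.futures.ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    codes = list(pool.map(hit, range(NUM_REQUESTS)))

# Summarize the responses, e.g. {200: 4987, 502: 13}.
print({code: codes.count(code) for code in set(codes)})
```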

architectingwithgcp_fundamentals_course5_elasticcloudinfrastructure_containers

Kubernetes Load Balancing.  Create a Kubernetes cluster using Google Kubernetes Engine with the built-in network load balancer.  Deploy nginx into the cluster and verify that the application is working.  Undeploy the application and then redeploy the cluster using Ingress to connect it to a Google Compute Engine HTTPS load balancer. Redeploy and test.  
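In the lab the nginx Deployment and the load-balancing Service are created with kubectl; the sketch below shows a roughly equivalent flow using the official kubernetes Python client, assuming the kubeconfig has already been written (for example by `gcloud container clusters get-credentials`). The names, image tag, and replica count are illustrative assumptions.

```python
# Minimal sketch: deploy nginx and expose it via a network load balancer
# on an existing GKE cluster, using the kubernetes Python client.
from kubernetes import client, config

config.load_kube_config()  # assumes kubeconfig is already configured
apps = client.AppsV1Api()
core = client.CoreV1Api()

# nginx Deployment with three replicas (hypothetical sizing).
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="nginx"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "nginx"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "nginx"}),
            spec=client.V1PodSpec(containers=[
                client.V1Container(
                    name="nginx",
                    image="nginx:1.25",
                    ports=[client.V1ContainerPort(container_port=80)],
                )
            ]),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)

# Service of type LoadBalancer, which GKE backs with a network load balancer.
service = client.V1Service(
    metadata=client.V1ObjectMeta(name="nginx"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "nginx"},
        ports=[client.V1ServicePort(port=80, target_port=80)],
    ),
)
core.create_namespaced_service(namespace="default", body=service)
```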

architectingwithgcp_fundamentals_course6_reliablecloudinfrastructure_designandprocess

Beginning AppServer. Created a Cloud Deployment Manager template for AppServer in YAML format and learned to work with YAML, relating JSON to YAML and correcting syntax errors in YAML. Created a prototype template from the documentation by converting the reference to YAML, then pruned the prototype template to the common and required properties. Used gcloud commands to interrogate the GCP environment and find the exact values and URIs required to configure the template. Worked with Deployment Manager to create multiple environments for different organizations and purposes, then deleted the deployments after they had served their purpose.

Package and Deploy. Using Deployment Manager templates, including Jinja2 templates (a rendering sketch follows below), created a virtual machine that loads a Python application and its dependencies, then boots up and configures itself to run a service. Specifically, deployed a service using a pre-written Python application called "Echo" and example Deployment Manager templates written in YAML and Jinja2. Created a deployment package suitable for Deployment Manager using the Python package manager, pip. Staged the package in a Cloud Storage bucket. Manually tested the application to ensure that it was working properly. Tested the new service.

Adding Load Balancing. Used the pre-written Python application "Echo" and existing Deployment Manager templates written in Jinja2. Created a deployment package suitable for Deployment Manager using pip. Staged the package in a Cloud Storage bucket. Followed best practices and manually tested the application to ensure that it was working properly. Investigated and gathered the information necessary to configure health checks. Used Deployment Manager to deploy the Echo load balancer (LB) service. Tested the new service. Enabled the health check and verified that it was functioning.

Deploy a Full Production Application with Stackdriver (monitoring). Cloned a public repository of deployment management templates. Launched a cloud service from a collection of templates. Configured basic black-box monitoring of a logbook application. Enabled Stackdriver to configure monitoring and alert notifications, and set up graphical dashboards showing CPU usage and received packets with dynamically updating charts. Created an uptime check to recognize a loss of service. Established an alerting policy to trigger incident response procedures. Used Apache Bench to generate load traffic to test the system and trigger autoscaling. Simulated a service outage to test notifications and resiliency features. Verified receipt of email notification of failure.
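Deployment Manager expands Jinja2 templates into the YAML resource definitions it deploys. The short sketch below uses the jinja2 Python package locally only to illustrate how parameters are substituted into a pruned instance resource; the resource is trimmed for illustration and omits required properties (disks, network interfaces), so it is not a deployable configuration.

```python
# Minimal sketch of Jinja2 parameter substitution, as used conceptually by
# Deployment Manager templates. Assumes the jinja2 package; all values are
# illustrative and the resource is deliberately pruned.
import jinja2

TEMPLATE = """\
resources:
- name: {{ name }}
  type: compute.v1.instance
  properties:
    zone: {{ zone }}
    machineType: zones/{{ zone }}/machineTypes/{{ machine_type }}
"""

rendered = jinja2.Template(TEMPLATE).render(
    name="appserver-1", zone="us-central1-a", machine_type="n1-standard-1")
print(rendered)
```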

dataengineering_gcp_course3_serverlessdataanalysis_googlebigquery-clouddataflow

BigQuery. BigQuery is a petabyte-scale data warehouse on Google Cloud that can run queries. Create a query, then modify it to add clauses, subqueries, built-in functions, and joins. Load a CSV file into a BigQuery table using the web UI. Load a JSON file into a BigQuery table using the CLI. Export a table using the web UI. Use nested fields, regular expressions, the WITH statement, GROUP BY, and HAVING.

Dataflow. Dataflow is a runner (execution framework); each step in a pipeline is called a transform, and a pipeline runs from a source (e.g., BigQuery) to a sink (e.g., Cloud Storage). Set up a Python Dataflow project using Apache Beam, which executes data processing workflows. Create a Dataflow pipeline that uses filtering. Execute the pipeline locally and on the cloud.

MapReduce. To process a large dataset, break it into pieces so that each compute node processes data that is local to it. The map operations happen in parallel on chunks of the original input data; the results of these maps are sent to the reduce nodes, where aggregates are calculated. Each reduce node processes one key or one set of keys. Identify map and reduce operations. Execute the pipeline. Use command-line parameters.

Side Inputs. A side input is an additional input that your DoFn can access each time it processes an element in the input PCollection. When you specify a side input, you create a view of some other data that can be read from within the ParDo transform's DoFn while processing each element. Load data into BigQuery and run complex queries. Execute a Dataflow pipeline that carries out map and reduce operations, uses side inputs, and streams into BigQuery. Use the output of one pipeline as a side input to another pipeline. (A pipeline sketch combining filtering, a map/reduce step, and a side input follows below.)
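The sketch below is a minimal Apache Beam pipeline in Python combining the ideas above: a map step, a filter that reads a side input, and a count (the reduce). The input and output paths and the stop-word list are illustrative assumptions, not the lab's actual data.

```python
# Minimal Apache Beam sketch: map, filter with a side input, and reduce.
# Assumes apache-beam is installed; file paths are hypothetical.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def run():
    with beam.Pipeline(options=PipelineOptions()) as p:
        lines = p | "Read" >> beam.io.ReadFromText("input.txt")        # hypothetical input

        # A small PCollection turned into an in-memory list side input.
        stopwords = p | "Stopwords" >> beam.Create(["the", "a", "of"])

        counts = (
            lines
            | "Split" >> beam.FlatMap(lambda line: line.lower().split())   # map
            | "DropStopwords" >> beam.Filter(
                lambda word, stop: word not in stop,
                stop=beam.pvalue.AsList(stopwords))                         # filter with side input
            | "Count" >> beam.combiners.Count.PerElement()                  # reduce
        )

        counts | "Write" >> beam.io.WriteToText("output")                   # hypothetical output

if __name__ == "__main__":
    run()
```

Running it locally uses the DirectRunner by default; the same pipeline can be submitted to Dataflow by switching the runner in the pipeline options.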

dataengineering_gcp_course4_serverless-machine-learning-with-tensorflow

Explore and create ML datasets. Sample the dataset and create training, validation, and testing datasets for local development of TensorFlow models. Create a benchmark to evaluate the performance of the ML model. TensorFlow is used for numerical computations, expressed as directed graphs.

Getting started with TensorFlow. Explore the TensorFlow Python API: build a graph, run a graph, and feed values into a graph. Find the areas of triangles using TensorFlow (a sketch follows below).

Learning from tf.estimator. Read from a pandas DataFrame into tf.constant, create feature columns for the estimator, and perform linear regression with the tf.estimator framework. Execute deep neural network regression. Use a benchmark dataset.

Refactoring to add batching and feature creation. Refactor the input and the way the features are created. Create and train the model; evaluate the model.

Distributed training and monitoring. Create features out of the input data. Train and evaluate. Monitor with TensorBoard. To run TensorFlow at scale, use Cloud ML Engine: package up the code, find absolute paths to the data, run the Python module from the command line, run locally using gcloud, submit a training job using gcloud, deploy the model, make predictions, and train on a 1-million-row dataset.

Feature Engineering. Work with feature columns. Add feature crosses in TensorFlow. Read data from BigQuery. Create datasets using Dataflow. Use a wide-and-deep model.
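A minimal sketch of the triangle-area exercise is shown below, using TensorFlow ops and Heron's formula; the side lengths are illustrative, and the code uses eager execution rather than the explicit graph-and-session style of the original course material.

```python
# Minimal sketch: compute triangle areas with TensorFlow (Heron's formula).
# Side lengths are illustrative; runs eagerly under TensorFlow 2.x.
import tensorflow as tf

def triangle_area(sides):
    """sides: a [N, 3] tensor of side lengths a, b, c."""
    a, b, c = sides[:, 0], sides[:, 1], sides[:, 2]
    s = (a + b + c) / 2.0                          # semi-perimeter
    return tf.sqrt(s * (s - a) * (s - b) * (s - c))

sides = tf.constant([[5.0, 3.0, 7.1],
                     [2.3, 4.1, 4.8]])
print(triangle_area(sides).numpy())
```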

dataengineering_gcp_course5_building-resilient-streaming-systems-on-google-cloud-platform

Streaming is data processing for unbounded (infinite) datasets; the data is in motion.

Cloud Pub/Sub connects applications and services through a messaging infrastructure. Pub/Sub is a global messaging queue, essentially a message bus (buffer); the message bus is reliable, with high throughput and low latency. Pub/Sub is about capturing and distributing data, and it is serverless and global. Pub/Sub can be the source and BigQuery the sink when streaming events (a short publishing sketch follows below).

Dataflow handles both batch and streaming without changing the code (it can control late-arriving and out-of-order data). It performs continuous computations and continuous queries, and it autoscales and rebalances work. Stream processing is best done with Dataflow: resources are deployed on demand, per job, and work is constantly rebalanced across them.

BigQuery does analytics on both historical and streaming data, and it can query data as it arrives from streaming pipelines. BigQuery is SQL-based, its latency is on the order of seconds, and it is good for ad hoc queries.

Bigtable is a big, fast, autoscaling NoSQL database. Bigtable clusters contain only pointers to the data, not the data itself; the data is stored in Colossus, Google's distributed file system. Nodes serve contiguous rows of data. Bigtable supports the HBase API, and its latency is on the order of milliseconds. BigQuery and Bigtable serve user-generated, ad hoc queries, the kind you run only once in a while.

Apache Beam is a programming model for both batch and streaming, and it supports multiple runtimes. Beam supports time-based shuffle to put data in the correct window; windowing is about event time, not processing time. Beam lets you choose between high and low latency. Beam handles structured, semi-structured, and object data, and it lets you run queries. Beam offers a single pipeline, a unified model for processing batch and streaming data.
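As a concrete illustration of Pub/Sub as the source of a streaming pipeline, the sketch below publishes a small JSON event with the google-cloud-pubsub Python client; the project ID, topic name, and event fields are illustrative assumptions.

```python
# Minimal sketch: publish a streaming event to Cloud Pub/Sub.
# Assumes google-cloud-pubsub; project, topic, and payload are hypothetical.
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "sensor-events")  # hypothetical

event = {"sensor_id": "lane-42", "speed_mph": 61.5}
future = publisher.publish(topic_path, data=json.dumps(event).encode("utf-8"))

# result() blocks until the message is accepted and returns its message ID.
print("Published message id:", future.result())
```

A Dataflow pipeline subscribed to the same topic could then window the events and stream aggregates into BigQuery, matching the source-to-sink pattern described above.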
