🔭 I’m currently working on Amazon Web Services
👨‍💻 All of my projects are available at https://gurjeetssingh.github.io/Data-Analysis/
💬 Ask me about Amazon Web Services
📫 How to reach me gurjeet.officialid@gmail.com
📄 Know about my experiences www.linkedin.com/in/gurjeet-singh-5b7368290
⚡ Fun fact I love exploring new hobbies whenever I get the chance!
Below is the Data Analytic Platform (DAP) diagram, which illustrates the high-level architecture and process flow that this project will follow to implement the data analytic platform for the City of Vancouver.
Perform a descriptive analysis of building permits in Vancouver using AWS services to ingest, process, and analyze data efficiently, focusing on the Project Value of properties for the years 2023 and 2024.
This project aims to demonstrate how to ingest, process, and analyze Vancouver’s building permit dataset using AWS services. The analysis focuses on understanding the Project Value for the years 2023 and 2024 to uncover key trends and insights.
Analyzing Issued Building Permits in Vancouver for 2023 and 2024
The dataset includes building permit data sourced from the Vancouver Open Data Portal.
Key Features:
The dataset was downloaded in Excel format and ingested into AWS S3 for secure storage.
Screenshot showing the S3 bucket and uploaded dataset. This demonstrates that the data has been properly ingested into AWS.
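The upload step can also be scripted. The following is a minimal sketch, not the exact method used in this project: it assumes a boto3-style S3 client is passed in, and the key prefix `raw/building-permits/` and the bucket name in the usage comment are hypothetical.

```python
from pathlib import Path


def build_object_key(local_path: str, prefix: str = "raw/building-permits/") -> str:
    """Return the S3 object key for a local file: prefix + file name."""
    return prefix + Path(local_path).name


def upload_dataset(s3_client, local_path: str, bucket: str,
                   prefix: str = "raw/building-permits/") -> str:
    """Upload a local file with a boto3-style S3 client and return its key."""
    key = build_object_key(local_path, prefix)
    # upload_file(Filename, Bucket, Key) is the standard boto3 S3 client call.
    s3_client.upload_file(local_path, bucket, key)
    return key


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   upload_dataset(boto3.client("s3"), "issued-building-permits.xlsx",
#                  "my-permits-bucket")  # hypothetical file and bucket names
```

Passing the client in as a parameter keeps the function easy to test without touching a real bucket.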
Data was cleaned using AWS Glue DataBrew to address missing values and structure the dataset for analysis.
Screenshot from AWS Glue DataBrew showing the data cleaning process, such as handling missing values or structuring the dataset.
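DataBrew recipes are built in the console, so no code is needed for this step. For readers who want to see the logic, here is a rough plain-Python equivalent. It is a sketch under assumptions: the column name `ProjectValue` and the policy of dropping rows with a missing value are illustrative choices, not a record of the recipe used here.

```python
from typing import Optional


def parse_project_value(raw: Optional[str]) -> Optional[float]:
    """Convert a raw cell such as '$1,250,000' to a float; blank -> None."""
    if raw is None:
        return None
    text = raw.strip().replace("$", "").replace(",", "")
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None  # unparseable text is treated as missing


def clean_rows(rows):
    """Drop rows without a usable project value and normalise the rest."""
    cleaned = []
    for row in rows:
        value = parse_project_value(row.get("ProjectValue"))
        if value is None:
            continue  # assumed policy: exclude rows with missing values
        cleaned.append({**row, "ProjectValue": value})
    return cleaned
```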
Data was stored in an AWS S3 bucket for easy access and processing.
A data pipeline was diagrammed in draw.io and implemented with AWS Glue to automate the steps.
Screenshot of the AWS Glue jobs running or of the pipeline creation process, highlighting the data transformation workflow.
Screenshot of AWS Athena showing query results, including summary statistics such as the mean and median of project values.
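The same descriptive statistics can be cross-checked outside Athena. This sketch assumes each cleaned row carries an `IssueYear` and a numeric `ProjectValue` (both field names are assumptions).

```python
from collections import defaultdict
from statistics import mean, median


def project_value_stats(rows):
    """Return {year: {"count", "mean", "median"}} for ProjectValue."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["IssueYear"]].append(row["ProjectValue"])
    return {
        year: {"count": len(vals), "mean": mean(vals), "median": median(vals)}
        for year, vals in sorted(by_year.items())
    }
```

Reporting the median next to the mean is useful here because a few very large projects can pull the mean far above a typical permit value.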
The data analysis of Vancouver’s building permits provided several important insights that can inform future urban development and project planning initiatives for the City of Vancouver:
Significant Increase in High-Value Projects in 2024: Our analysis revealed a notable surge in high-value projects in 2024, especially in the residential sector. This trend could signal increased urban development in key districts, potentially spurred by favorable zoning policies or economic incentives. This shift will likely place a greater demand on city infrastructure and public services.
Concentration of Permits in Specific Districts: The majority of permits issued during this period were concentrated in rapidly growing districts such as the Mount Pleasant and Downtown Vancouver areas. This concentration of activity may suggest targeted investment in high-growth regions, but it also highlights the potential for uneven urban development, which may require policy adjustments to ensure balanced growth across the city.
Residential vs. Commercial Project Distribution: The analysis revealed that residential projects vastly outnumbered commercial ones. This may suggest a housing boom driven by population growth or increased demand for residential spaces. The City of Vancouver might need to prepare for increased strain on residential infrastructure and services.
Time Gaps Between Permit Application and Issuance: A noteworthy finding was the inconsistency in the time taken between permit application and issuance, which ranged from several days to months. This could point to potential inefficiencies within the approval process that the city might need to address to streamline development and support economic growth.
Contractor Patterns and Preferences: The data suggested a recurring set of contractors handling high-value projects, which could be indicative of trusted relationships between the city and certain developers. Monitoring these patterns could help ensure fair distribution of opportunities among contractors and maintain healthy competition in the local construction market.
These insights will help city officials, urban planners, and decision-makers better understand the landscape of urban development and tailor their strategies to meet the evolving needs of Vancouver’s residents and businesses.
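The concentration findings above (by district, and residential versus commercial) come down to counting permits per category and expressing each count as a share of the total. A minimal sketch, with the field name in the usage comment assumed for illustration:

```python
from collections import Counter


def share_by(rows, field):
    """Return [(value, count, share)] sorted by count, largest first."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return [(value, n, n / total) for value, n in counts.most_common()]


# Usage: share_by(permits, "GeoLocalArea")  # hypothetical district column
```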
The following key deliverables were produced as part of this project, each contributing to a thorough understanding of Vancouver’s building permit landscape and providing actionable insights for city officials:
Comprehensive Data Analysis Report: A detailed report documenting the entire analysis process, including data ingestion, cleaning, and analysis, using AWS services. This report highlights key trends, insights, and potential areas for policy adjustment in urban development.
Interactive Dashboards (AWS QuickSight): Visual dashboards showcasing the distribution of building permits by project value, geographical distribution, and time of issuance. These interactive visualizations will allow city planners to filter data by district, contractor, or project type to make real-time decisions.
Athena Query Results: An organized set of SQL queries run via AWS Athena, which city officials can reuse or modify to explore further data points such as permit issuance time, contractor performance, and project delays.
AWS Pipeline Documentation: A documented pipeline flowchart outlining how the data flows from ingestion (S3) to analysis (Athena) and visualization. This will help technical staff understand how the data pipeline was built and provide a foundation for future projects or data extensions.
Cost Analysis Using AWS Pricing Calculator: A detailed breakdown of the estimated costs involved in running the Data Analytic Platform (DAP) using AWS services, including S3 storage, Glue DataBrew, Athena, and QuickSight. This will allow city officials to assess the long-term financial feasibility of maintaining and scaling the platform.
Recommendations for Process Improvement: Based on the analysis, we have provided a set of actionable recommendations aimed at optimizing the building permit issuance process and ensuring more balanced urban development across Vancouver’s districts.
These deliverables not only provide actionable insights for the present but also lay the foundation for future scalability and more efficient city planning workflows.
</br> </br> </br>
The City of Vancouver has initiated a migration to AWS to implement a robust data analytic platform (DAP). This diagnostic phase focuses on ensuring that the platform is secure, well-governed, and consistently monitored. This involves applying encryption and security policies, governance frameworks, and real-time performance monitoring.
Below is the Data Analytic Platform (DAP) diagram, which outlines the architecture for Vancouver’s AWS-based data platform. This phase focuses on data protection, governance, and monitoring.
The dataset consists of operational data related to the building permits issued by the City of Vancouver. The data is stored securely using AWS S3 with encryption, governance rules, and replication rules applied. Key features include:
This project focuses on ensuring Data Protection, Data Governance, and Data Monitoring using AWS services. Here’s how each of these steps was implemented:
The architecture for the Vancouver DAP was evaluated against the six pillars of the AWS Well-Architected Framework:
Operational Excellence: The DAP automates critical processes, including encryption and data replication, ensuring that it operates smoothly without manual intervention.
Security: AWS KMS and IAM roles were used to enforce strong encryption and access control policies, ensuring that all sensitive data is protected against unauthorized access.
Reliability: Bucket replication rules provide data redundancy, and CloudWatch real-time performance alerts help keep the platform reliable.
Performance Efficiency: The platform uses efficient data storage and retrieval mechanisms, including optimized bucket policies and monitoring services that help minimize resource usage during peak times.
Cost Optimization: AWS Pricing Calculator was used to estimate the costs of storage, replication, and monitoring and keep them within budget, while CloudWatch metrics are used to track resource usage against real-time demand.
Sustainability: AWS services allow for highly sustainable data processing and storage, reducing the environmental impact of the platform’s operation by utilizing cloud-based, energy-efficient services.
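As an illustration of the Security pillar, default SSE-KMS encryption on a bucket can be set with the boto3 `put_bucket_encryption` call. This is a sketch, not the project's actual configuration: the bucket name and KMS key ID are placeholders supplied by the caller, and enabling S3 Bucket Keys is an assumed choice.

```python
def default_encryption_config(kms_key_id: str) -> dict:
    """Build the ServerSideEncryptionConfiguration for SSE-KMS defaults."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_id,
                },
                # Assumed choice: S3 Bucket Keys reduce calls to KMS.
                "BucketKeyEnabled": True,
            }
        ]
    }


def enable_default_encryption(s3_client, bucket: str, kms_key_id: str) -> None:
    """Apply the configuration using a boto3-style S3 client."""
    s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_encryption_config(kms_key_id),
    )
```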
This diagnostic analysis revealed several critical insights into the security, governance, and monitoring aspects of the City of Vancouver’s data platform:
Enhanced Data Security: The implementation of KMS encryption and IAM roles ensures that only authorized users can access sensitive data, reducing the risk of unauthorized access.
Comprehensive Governance: With AWS Config and CloudTrail, all actions are logged, and governance policies are continuously enforced, ensuring compliance with security standards and regulations.
Proactive Monitoring and Alerting: AWS CloudWatch and SNS enable real-time monitoring of system performance, ensuring that any anomalies are detected and resolved swiftly. This reduces downtime and improves the platform’s reliability.
These insights support the overall goal of providing a secure, well-governed, and continuously monitored data platform for the City of Vancouver.
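To make the monitoring and alerting idea concrete, the sketch below builds the arguments for a CloudWatch alarm that notifies an SNS topic. The choice of metric (S3 `BucketSizeBytes`), the threshold, and the alarm name pattern are assumptions for illustration; the alarms actually configured for the DAP may differ.

```python
def bucket_size_alarm(bucket: str, topic_arn: str, threshold_bytes: int) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"{bucket}-size-above-threshold",  # hypothetical naming
        "Namespace": "AWS/S3",
        "MetricName": "BucketSizeBytes",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": "StandardStorage"},
        ],
        "Statistic": "Average",
        "Period": 86400,  # S3 storage metrics are reported once per day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_bytes),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that receives the alert
    }


# Usage (requires boto3 and configured AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **bucket_size_alarm("my-bucket", "<sns-topic-arn>", 5 * 1024**3))
```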
The following deliverables have been produced as part of this project:
</br> </br>
The HR department at UCW requires a robust system to monitor and evaluate recruitment processes. This diagnostic phase focuses on understanding inefficiencies in hiring by calculating the Average Days to Fill (ADF) job positions. The goal is to identify bottlenecks and provide actionable insights to improve the recruitment process over time.
The dataset consists of HR job postings and related details such as hiring dates and offer statuses. The data is stored securely using AWS S3, and AWS Glue is used for processing. Key fields include:
This project follows a diagnostic analysis process, aiming to uncover inefficiencies in HR recruitment by focusing on the Average Days to Fill (ADF) metric.
The HR job postings dataset is ingested from AWS S3 using AWS Glue.
A Glue job is created to extract the data, perform transformations, and load it into an S3 bucket for further processing.
A Days to Fill column is generated, representing the number of days taken from PostingDate to HireDate for each job posting.
The data is then grouped by hiring year (derived from HireDate), calculating the Average Days to Fill (ADF) for each year.
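The ADF calculation itself is simple enough to sketch in plain Python. This is not the Glue job from the project: it assumes dates are ISO strings (`YYYY-MM-DD`) and that postings with no HireDate yet are skipped.

```python
from collections import defaultdict
from datetime import date


def days_to_fill(posting_date: str, hire_date: str) -> int:
    """Number of days from PostingDate to HireDate (ISO date strings)."""
    return (date.fromisoformat(hire_date) - date.fromisoformat(posting_date)).days


def average_days_to_fill(rows):
    """Return {hire_year: average Days to Fill} for filled postings."""
    by_year = defaultdict(list)
    for row in rows:
        if not row.get("HireDate"):
            continue  # assumed policy: unfilled postings are excluded
        hire_year = date.fromisoformat(row["HireDate"]).year
        by_year[hire_year].append(days_to_fill(row["PostingDate"], row["HireDate"]))
    return {year: sum(days) / len(days) for year, days in sorted(by_year.items())}
```

Grouping by the year of HireDate matches the per-hiring-year ADF output described in this section.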
The diagnostic analysis provided several actionable insights:
CSV Output: Contains the Average Days to Fill (ADF) for each hiring year, stored in S3.
QuickSight Dashboard: A visual dashboard that tracks recruitment efficiency and allows HR management to diagnose inefficiencies in their hiring processes.
By leveraging AWS Glue and AWS QuickSight, this diagnostic analysis successfully identified inefficiencies in UCW HR’s recruitment process. The insights provided will enable the HR team to optimize recruitment timelines and ensure more efficient hiring in the future.