Feb 28, 2023
7 min read

Pulsar Operators Tutorial Part 4: Use kpack to Streamline the Build Process

Yuwei Sung
Solutions Engineer, StreamNative
Note: As of the start of 2024, StreamNative manages Pulsar clusters on Kubernetes with a single, consolidated operator. The two previous operators, Pulsar Operators (Basic Version) and StreamNative Operator (Advanced Version), have been unified into the StreamNative Operator. New versions of Pulsar Operators will no longer be released; future updates and enhancements will be available exclusively through the StreamNative Operator, which is offered only as part of StreamNative's paid services.

In the previous blog, I demonstrated how to containerize Pulsar client apps (producer and consumer) using Dockerfiles in VS Code. This is probably the most common approach to cloud-native builds. However, because Pulsar supports many languages, maintaining separate Dockerfiles for Pulsar producer, consumer, and function apps becomes difficult as your system grows: specifying dependency versions, changing the base build and run images, mounting new ConfigMaps and Secrets (externalizing configuration), and adding TLS certificates all add complexity. With Dockerfiles, developers have to maintain all of those details on top of writing cloud-native apps.

In this blog, I will demonstrate how to streamline this process with kpack so that developers can focus on writing Pulsar producers, consumers, or functions in different languages.

Install and configure kpack

kpack is a Kubernetes operator implementing Cloud Native Buildpacks. If you like Google Cloud Build and want a similar experience inside your own Kubernetes clusters, kpack is an ideal tool. For more background, see the kpack and Cloud Native Buildpacks documentation.

1. kpack ships as a Kubernetes operator. First, install the operator into the kpack namespace.

kubectl create namespace kpack
kubectl apply -n kpack -f https://github.com/pivotal/kpack/releases/download/v0.5.4/release-0.5.4.yaml
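
You can verify the installation before moving on; the kpack controller and webhook pods should be Running:

kubectl get pods -n kpack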

2. Once the operator is installed, create a pull Secret for your Docker registry so kpack can push the images it builds. The Secret can hold your Docker registry password or a robot token.

kubectl create secret -n kpack docker-registry mydocker \
                 --docker-username=<your-docker-registry-user-name> \
                 --docker-password=<your docker-hub-token> \
                 --docker-server=https://index.docker.io/v1/
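
A quick check that the credential exists (the Secret type should be kubernetes.io/dockerconfigjson):

kubectl get secret mydocker -n kpack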

3. Create a service account in the kpack namespace and associate the Secret with it. Note that the Secret needs to appear under both secrets and imagePullSecrets in this service account.

kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kpack-sa
  namespace: kpack
secrets:
- name: mydocker
imagePullSecrets:
- name: mydocker
EOF
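
You can confirm that the Secret is attached under both fields:

kubectl get serviceaccount kpack-sa -n kpack -o yaml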

4. Create a ClusterStore custom resource to hold the necessary buildpacks. Here, I list the buildpacks needed for Python (cpython, python-start, pip-install, pip, procfile, and ca-certificates). Note that ClusterStore (like ClusterStack below) is cluster-scoped, so it does not need a namespace. Refer to the kpack documentation for more details.

kubectl apply -f - <<EOF
apiVersion: kpack.io/v1alpha2
kind: ClusterStore
metadata:
  name: python-store
spec:
  sources:
  - image: gcr.io/paketo-buildpacks/cpython
  - image: gcr.io/paketo-buildpacks/pip-install
  - image: gcr.io/paketo-buildpacks/python-start
  - image: gcr.io/paketo-buildpacks/pip
  - image: gcr.io/paketo-buildpacks/procfile
  - image: gcr.io/paketo-buildpacks/ca-certificates
EOF

5. Create a ClusterStack, which defines the build and run images. You may notice that this custom resource resembles a multi-stage build in a Dockerfile.

kubectl apply -f - <<EOF
apiVersion: kpack.io/v1alpha2
kind: ClusterStack
metadata:
  name: python-stack
spec:
  id: "io.buildpacks.stacks.bionic"
  buildImage:
    image: "paketobuildpacks/build:base-cnb"
  runImage:
    image: "paketobuildpacks/run:base-cnb"
EOF
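
Since ClusterStore and ClusterStack are cluster-scoped, you can check that both were created and resolved their images with:

kubectl get clusterstore python-store
kubectl get clusterstack python-stack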

6. Define a Builder. A kpack Builder is roughly equivalent to running "docker build", "docker tag", and "docker push".

kubectl apply -f - <<EOF
apiVersion: kpack.io/v1alpha2
kind: Builder
metadata:
  name: python-builder
  namespace: kpack
spec:
  serviceAccountName: kpack-sa
  tag: index.docker.io/yuwsung1/pulsar-python-producer
  stack:
    name: python-stack
    kind: ClusterStack
  store:
    name: python-store
    kind: ClusterStore
  order:
  - group:
    - id: paketo-buildpacks/ca-certificates
    - id: paketo-buildpacks/cpython
    - id: paketo-buildpacks/pip
    - id: paketo-buildpacks/pip-install
    - id: paketo-buildpacks/python-start
    - id: paketo-buildpacks/procfile
EOF
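
Before building any images, you can confirm the Builder has resolved its buildpacks and pushed a builder image (the READY column should show True):

kubectl get builder -n kpack python-builder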

Build the client apps

After you deploy a ClusterStore, a ClusterStack and a Builder, you are ready to build some images. These images are defined as Custom Resources too.

1. Create the producer image.

kubectl apply -f - <<EOF
apiVersion: kpack.io/v1alpha2
kind: Image
metadata:
  name: pulsar-producer-image
  namespace: kpack
spec:
  tag: index.docker.io/yuwsung1/pulsar-python-producer
  serviceAccountName: kpack-sa
  builder:
    name: python-builder
    kind: Builder
  source:
    git:
      url: https://github.com/yuweisung/pulsar-python
      revision: kpack
    subPath: "producer-bp"
EOF

2. Create the consumer image.

kubectl apply -f - <<EOF
apiVersion: kpack.io/v1alpha2
kind: Image
metadata:
  name: pulsar-consumer-image
  namespace: kpack
spec:
  tag: index.docker.io/yuwsung1/pulsar-python-consumer
  serviceAccountName: kpack-sa
  builder:
    name: python-builder
    kind: Builder
  source:
    git:
      url: https://github.com/yuweisung/pulsar-python
      revision: kpack
    subPath: "consumer-bp"
EOF

3. Once those two Image CRs are applied, you can use kp (the kpack CLI) or kubectl to check the build status. After "Steps Completed" reaches "export", the image has been pushed to the Docker registry specified in image.spec.tag.

kubectl describe -n kpack build pulsar-consumer-image-build-1
  …
  Steps Completed:
    prepare
    analyze
    detect
    restore
    build
    export
   …
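
You can also check the Image resources directly, or, if you have the kp CLI installed, list and tail builds (the image name below matches the producer Image created earlier):

kubectl get images -n kpack
kp build list pulsar-producer-image -n kpack
kp build logs pulsar-producer-image -n kpack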

4. You can reuse the ConfigMap and Deployment from Part 3 to test the container images. The following manifests are the same as those in Part 3.

kubectl apply -f - <<EOF
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: pulsar-producer-config
  namespace: pulsar-client
data:
  pulsar_url: "pulsar://10.0.0.36:6650"
  topic: "my-topic"
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: my-producer
  namespace: pulsar-client
spec:
  selector:
    matchLabels:
      app: my-producer
  replicas: 1
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-producer
    spec:
      containers:
        - name: pulsar-producer
          image: yuwsung1/pulsar-python-producer
          imagePullPolicy: Always
          tty: true
          resources:
            limits:
              cpu: "500m"
              memory: "128Mi"
            requests:
              cpu: "250m"
              memory: "64Mi"
          env:
            - name: PULSAR_URL
              valueFrom:
                configMapKeyRef:
                  name: pulsar-producer-config
                  key: pulsar_url
            - name: PULSAR_TOPIC
              valueFrom:
                configMapKeyRef:
                  name: pulsar-producer-config
                  key: topic
EOF
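
If the build worked, the producer logs should show messages being published (the exact output depends on the producer code from Part 3):

kubectl logs -n pulsar-client deploy/my-producer --tail=20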

5. Create a Deployment and a ConfigMap for the consumer.

kubectl apply -f - <<EOF
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: pulsar-consumer-config
  namespace: pulsar-client
data:
  pulsar_url: "pulsar://10.0.0.36:6650"
  topic: "my-topic"
  subscription_name: "my-subscription1"
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: my-consumer
  namespace: pulsar-client
spec:
  selector:
    matchLabels:
      app: my-consumer
  replicas: 1
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-consumer
    spec:
      containers:
        - name: pulsar-consumer
          image: yuwsung1/pulsar-python-consumer
          imagePullPolicy: Always
          tty: true
          resources:
            limits:
              cpu: "500m"
              memory: "128Mi"
            requests:
              cpu: "250m"
              memory: "64Mi"
          env:
            - name: PULSAR_URL
              valueFrom:
                configMapKeyRef:
                  name: pulsar-consumer-config
                  key: pulsar_url
            - name: PULSAR_TOPIC
              valueFrom:
                configMapKeyRef:
                  name: pulsar-consumer-config
                  key: topic
            - name: PULSAR_SUBSCRIPTION
              valueFrom:
                configMapKeyRef:
                  name: pulsar-consumer-config
                  key: subscription_name
EOF
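
Similarly, follow the consumer logs to confirm that the messages are received:

kubectl logs -n pulsar-client deploy/my-consumer -f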

Once these two containers are deployed, you should see the messages being delivered. From here, you can follow the same ArgoCD workflow from Part 2: push the kpack CRs to a GitHub repository and create an ArgoCD app (sketched below) to automate the image build process.
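
For reference, below is a minimal sketch of such an ArgoCD Application. The repoURL and path are placeholders for wherever you keep the kpack CRs, so adjust them to your own repository:

kubectl apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: pulsar-kpack-images
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/<your-org>/<your-gitops-repo>  # placeholder
    targetRevision: main
    path: kpack  # directory containing the Builder/Image CRs
  destination:
    server: https://kubernetes.default.svc
    namespace: kpack
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF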

Conclusion

This blog shows how to automate the container build process for two Pulsar Python client apps. As you can see, the Python code in this tutorial is just a revision of a GitHub repository. Whenever developers push their code to GitHub, kpack kicks off a new build and publishes an updated image.
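
For example, assuming the repositories referenced above, pushing a new commit to the kpack branch should trigger a second Build for each Image; you can watch for it with:

kubectl get builds -n kpack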

You can find the example in my GitHub repositories.

More on Apache Pulsar

Pulsar has become one of the most active Apache projects over the past few years, with a vibrant community driving innovation and improvements to the project.

Yuwei Sung
YuWei Sung is a Solutions Engineer at StreamNative. His career portfolio includes urban planning/geospatial analysis, data mining/machine learning, and distributed data systems. He has been in the field (presales, postsales, and a bit of support) for about a decade (EMC, Dell, Pivotal, VMware, and Yugabyte).
