eks-config.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

iam:
  withOIDC: true

metadata:
  name: basic-cluster
  region: ap-south-1
  version: "1.27"

managedNodeGroups:
  - name: ng-dedicated-1
    instanceType: t3a.xlarge
    desiredCapacity: 4
    spot: true
    labels:
      role: spot
    ssh:
      allow: true # will use ~/.ssh/id_rsa.pub as the default ssh key
    iam:
      withAddonPolicies:
        autoScaler: true
        awsLoadBalancerController: true
        certManager: true
        externalDNS: true
        ebs: true
Create the Cluster
eksctl create cluster -f eks-config.yaml
Install KServe with Knative and Istio
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.11.0/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.11.0/serving-core.yaml
kubectl apply -l knative.dev/crd-install=true -f https://github.com/knative/net-istio/releases/download/knative-v1.11.0/istio.yaml
kubectl apply -f https://github.com/knative/net-istio/releases/download/knative-v1.11.0/istio.yaml
kubectl apply -f https://github.com/knative/net-istio/releases/download/knative-v1.11.0/net-istio.yaml
kubectl patch configmap/config-domain \
--namespace knative-serving \
--type merge \
--patch '{"data":{"emlo.tsai":""}}'
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.11.0/serving-hpa.yaml
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.2/cert-manager.yaml
Wait for cert manager pods to be ready
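The wait steps can be scripted instead of checked by eye. The following is a minimal sketch, assuming cert-manager is installed in the `cert-manager` namespace and the KServe controller in the `kserve` namespace (the defaults of the manifests used here, but worth confirming on your cluster). The helper names are hypothetical.

```python
import subprocess
from typing import List


def kubectl_wait_command(namespace: str, timeout_seconds: int = 180) -> List[str]:
    """Build a `kubectl wait` command that blocks until every Deployment
    in `namespace` reports the Available condition."""
    return [
        "kubectl", "wait",
        "--for=condition=Available",
        "deployment", "--all",
        "--namespace", namespace,
        f"--timeout={timeout_seconds}s",
    ]


def wait_for_deployments(namespace: str, timeout_seconds: int = 180) -> None:
    """Run the wait; raises CalledProcessError if kubectl exits non-zero
    (for example when the timeout expires)."""
    subprocess.run(kubectl_wait_command(namespace, timeout_seconds), check=True)
```

Calling `wait_for_deployments("cert-manager")` before applying kserve.yaml, and `wait_for_deployments("kserve")` before applying kserve-runtimes.yaml, mirrors the two manual waits in this section.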
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.2/kserve.yaml
Wait for KServe Controller Manager to be ready
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.2/kserve-runtimes.yaml
Create EBS Controller
eksctl create iamserviceaccount \
--name ebs-csi-controller-sa \
--namespace kube-system \
--cluster basic-cluster \
--role-name AmazonEKS_EBS_CSI_DriverRole \
--role-only \
--attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
--approve \
--region ap-south-1
eksctl create addon --name aws-ebs-csi-driver --cluster basic-cluster --service-account-role-arn arn:aws:iam::006547668672:role/AmazonEKS_EBS_CSI_DriverRole --region ap-south-1 --force
Create the Storage Class
sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
kubectl apply -f sc.yaml
Create S3 Service Account
s3.yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3creds
  annotations:
    serving.kserve.io/s3-endpoint: s3.ap-south-1.amazonaws.com # replace with your s3 endpoint e.g. minio-service.kubeflow:9000
    serving.kserve.io/s3-usehttps: "1" # by default 1, if testing with minio you can set to 0
    serving.kserve.io/s3-region: "ap-south-1"
    serving.kserve.io/s3-useanoncredential: "false" # omitting this is the same as false, if true will ignore provided credential and use anonymous credentials
type: Opaque
stringData: # use `stringData` for raw credential string or `data` for base64 encoded string
  AWS_ACCESS_KEY_ID: AxxxxQxxxxxxxxY2xxx
  AWS_SECRET_ACCESS_KEY: "C/dGcccuAxxxxxxxx25mxxxxxxx"
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-read-only
secrets:
  - name: s3creds
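The comment in the Secret mentions that `data` takes base64-encoded strings while `stringData` takes raw ones. As a small illustration of the difference, the sketch below converts a `stringData` mapping into the form `data` expects. The function name is hypothetical and the key value is a placeholder.

```python
import base64
from typing import Dict


def to_secret_data(string_data: Dict[str, str]) -> Dict[str, str]:
    """Base64-encode each value, as required under a Secret's `data` field."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in string_data.items()
    }


# Placeholder credential, not a real key.
print(to_secret_data({"AWS_ACCESS_KEY_ID": "abc"}))
```

Either field works for KServe; `stringData` is simply more convenient when writing the manifest by hand.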
Kubernetes Deployment Strategies
Canary Deployment using NGINX: https://kubernetes.github.io/ingress-nginx/examples/canary/
Canary Deployment using ISTIO: https://istio.io/v1.10/blog/2017/0.1-canary/
KServe Canary Deployment
Overview:
Canary rollouts in Kubernetes are supported by KServe for inference services.
Enables deploying a new version of an InferenceService to receive a percentage of traffic.
Configurable Canary Rollout Strategy:
KServe supports a customizable canary rollout strategy with multiple steps.
Rollout strategy includes provisions for rollback to the previous revision if a step fails.
Automatic Tracking:
KServe automatically tracks the last good revision rolled out with 100% traffic.
The canaryTrafficPercent field in the component's spec sets the traffic percentage for the new revision.
Traffic Splitting:
During canary rollout, traffic is split between the last good revision and the new revision based on canaryTrafficPercent.
The first revision deployed receives 100% traffic.
In subsequent steps, if 10% traffic is configured for the new revision, 90% goes to the LatestRolledoutRevision.
Handling Unhealthy Revisions:
If a revision is unhealthy or bad, traffic is not routed to it.
In case of a rollback, 100% traffic is directed to the previous healthy revision, the PreviousRolledoutRevision.
Rollout Steps:
Step 1: Deploy the first revision, which receives 100% traffic.
Step 2: Deploy multiple revisions, route a configured percentage to the new revision.
Step 3: Promote the LatestReadyRevision to the LatestRolledoutRevision, receiving 100% traffic and completing the rollout.
Note: Canary deployments allow controlled testing of new versions before full deployment, minimizing risks and ensuring a smooth transition.
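The traffic-splitting rules above can be written down as a toy model. This is only a sketch of the behaviour described in these notes, not KServe's actual implementation; the function and key names are hypothetical.

```python
from typing import Dict, Optional


def traffic_split(canary_traffic_percent: Optional[int],
                  has_previous_revision: bool) -> Dict[str, int]:
    """Return the percentage of traffic sent to the latest and previous revisions."""
    if canary_traffic_percent is not None and not 0 <= canary_traffic_percent <= 100:
        raise ValueError("canaryTrafficPercent must be between 0 and 100")
    # The first revision, or a revision promoted by omitting the field, takes everything.
    if not has_previous_revision or canary_traffic_percent is None:
        return {"latest": 100, "previous": 0}
    # Otherwise the remainder stays on the last rolled-out revision.
    return {"latest": canary_traffic_percent, "previous": 100 - canary_traffic_percent}


print(traffic_split(10, has_previous_revision=True))
```

Setting the percentage to 0 corresponds to a rollback, and leaving it unset corresponds to promotion.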
vit-classifier.yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "imagenet-vit"
spec:
  predictor:
    serviceAccountName: s3-read-only
    model:
      modelFormat:
        name: pytorch
      storageUri: s3://tsai-emlo/kserve-ig-2/imagenet-vit/
      resources:
        limits:
          cpu: 2600m
          memory: 4Gi
This is our usual KServe deployment, nothing fancy
❯ kubectl get isvc
NAME           URL                                     READY   PREV   LATEST   PREVROLLEDOUTREVISION          LATESTREADYREVISION            AGE
imagenet-vit   http://imagenet-vit.default.emlo.tsai   True           100                                     imagenet-vit-predictor-00001   3m12s
Now let’s deploy the cat classifier model as the new canary candidate
We’ll modify the cat-classifier to advertise its model name as vit-classifier
aws s3 cp --recursive s3://tsai-emlo/kserve-ig-2/cat-classifier/ cat-classifier/
cat-classifier/config/config.properties
inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8085
metrics_address=http://0.0.0.0:8082
grpc_inference_port=7070
grpc_management_port=7071
enable_envvars_config=true
install_py_dep_per_model=true
load_models=all
max_response_size=655350000
model_store=/mnt/models/model-store
default_response_timeout=600
enable_metrics_api=true
metrics_format=prometheus
number_of_netty_threads=4
job_queue_size=10
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"vit-classifier":{"1.0":{"defaultVersion":true,"marName":"cat-classifier.mar","minWorkers":1,"maxWorkers":1,"batchSize":1,"maxBatchDelay":100,"responseTimeout":600}}}}
aws s3 cp cat-classifier/config/config.properties s3://tsai-emlo/kserve-ig-2/cat-classifier/config/config.properties
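The model_snapshot entry is a single line of JSON, which is easy to mistype by hand. The sketch below generates that line for a given model name and .mar file, using the same worker and batch settings as the file above. The function name is hypothetical.

```python
import json


def model_snapshot_line(model_name: str, mar_name: str) -> str:
    """Build the `model_snapshot=...` line of a TorchServe config.properties."""
    snapshot = {
        "name": "startup.cfg",
        "modelCount": 1,
        "models": {
            # The key here is the name the server advertises for the model.
            model_name: {
                "1.0": {
                    "defaultVersion": True,
                    "marName": mar_name,
                    "minWorkers": 1,
                    "maxWorkers": 1,
                    "batchSize": 1,
                    "maxBatchDelay": 100,
                    "responseTimeout": 600,
                }
            }
        },
    }
    # Compact separators keep the value on one line without extra spaces.
    return "model_snapshot=" + json.dumps(snapshot, separators=(",", ":"))


print(model_snapshot_line("vit-classifier", "cat-classifier.mar"))
```

Generating the line this way makes it harder for the advertised model name and the name used in request URLs to drift apart unnoticed.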
Convert this image to base64:
http://images.cocodataset.org/val2017/000000039769.jpg
An online encoder such as https://base64.guru/converter/encode/image works for this
input.json
{
  "instances": [
    {
      "data": "BASE64 IMAGE"
    }
  ]
}
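The same payload can be produced locally instead of through an online converter. This is a minimal sketch, assuming the image has already been downloaded to disk; the function names and the example file name are hypothetical.

```python
import base64
import json


def build_payload(image_bytes: bytes) -> dict:
    """Wrap base64-encoded image bytes in the request body shown in input.json."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"instances": [{"data": encoded}]}


def write_input_json(image_path: str, output_path: str = "input.json") -> None:
    """Read an image file and write the matching request payload to disk."""
    with open(image_path, "rb") as image_file:
        payload = build_payload(image_file.read())
    with open(output_path, "w") as output_file:
        json.dump(payload, output_file)
```

For example, `write_input_json("000000039769.jpg")` would create input.json next to the script.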
import json

import requests

# Address of the ingress load balancer; the path selects the model to call.
url = "http://abd5bc101d01a4478baa58570709a6f6-1419869784.ap-south-1.elb.amazonaws.com/v1/models/imagenet-vit:predict"

with open("input.json") as f:
    payload = json.load(f)

# The Host header routes the request to the imagenet-vit InferenceService.
headers = {
    "Host": "imagenet-vit.default.emlo.tsai",
    "Content-Type": "application/json",
}

response = requests.request("POST", url, headers=headers, json=payload)
print(response.text)
{
  "predictions": [
    {
      "class": "Egyptian cat",
      "probability": 0.9374412894248962
    }
  ]
}
Prometheus & Grafana
Install Prometheus
https://github.com/kserve/kserve/blob/master/docs/samples/metrics-and-monitoring/README.md
Install Kustomize on your system
https://kubectl.docs.kubernetes.io/installation/kustomize/
git clone https://github.com/kserve/kserve
cd kserve
kustomize build docs/samples/metrics-and-monitoring/prometheus-operator | kubectl apply -f -
kubectl wait --for condition=established --timeout=120s crd/prometheuses.monitoring.coreos.com
kubectl wait --for condition=established --timeout=120s crd/servicemonitors.monitoring.coreos.com
kustomize build docs/samples/metrics-and-monitoring/prometheus | kubectl apply -f -
Test if Prometheus is working
kubectl port-forward service/prometheus-operated -n kfserving-monitoring 9090:9090
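With the port-forward running, Prometheus can also be checked from a script through its HTTP query API. The sketch below only builds the request URL; `up` is a built-in Prometheus metric, while the function name and the default base URL (matching the port-forward above) are assumptions.

```python
from urllib.parse import urlencode


def prometheus_query_url(promql: str, base_url: str = "http://localhost:9090") -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


print(prometheus_query_url("up"))
```

Fetching that URL, for example with `requests.get(...).json()`, should return a JSON document whose `status` field is `success` when Prometheus is healthy.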
We need to patch the queue-proxy image so that Prometheus can scrape metrics from every container in the pod
If an InferenceService uses Knative, then it has at least two containers in one pod, queue-proxy and kserve-container. A limitation of using Prometheus is that it supports scraping only one endpoint in the pod. When there are multiple containers in a pod that emit Prometheus metrics, this becomes an issue (see Prometheus for multiple port annotations issue #3756 for the full discussion on this topic). In an attempt to make an easy-to-use solution, the queue-proxy is extended to handle this use case.
https://github.com/kserve/kserve/blob/master/qpext/README.md
qpext_image_patch.yaml
data:
  queue-sidecar-image: kserve/qpext:latest
kubectl patch configmaps -n knative-serving config-deployment --patch-file qpext_image_patch.yaml
NOTE: You will need to delete your deployment and redeploy to use the new qpext image we just patched
Install Grafana
helm install grafana grafana/grafana
NOTE: would be a good idea to move it to another namespace
Fetch the admin user’s password from secrets
kubectl get secret --namespace default grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
Port forward Grafana
kubectl port-forward svc/grafana 3000:80
Add a New Data Source in Grafana
Add this as the server url
http://prometheus-operated.kfserving-monitoring.svc.cluster.local:9090
Add this Dashboard
https://grafana.com/grafana/dashboards/18032-knative-serving-revision-http-requests/
Now add some load to the model
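send.py
import json

import requests

# Address of the ingress load balancer; the path selects the model to call.
url = "http://abd5bc101d01a4478baa58570709a6f6-1419869784.ap-south-1.elb.amazonaws.com/v1/models/imagenet-vit:predict"

with open("input.json") as f:
    payload = json.load(f)

# The Host header routes the request to the imagenet-vit InferenceService.
headers = {
    "Host": "imagenet-vit.default.emlo.tsai",
    "Content-Type": "application/json",
}

response = requests.request("POST", url, headers=headers, json=payload)

# Print headers, status code, and body so failures are easy to spot in the output.
print(response.headers)
print(response.status_code)
print(response.json())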
You should start seeing some stats about the requests
Canary Deployment
Now we are ready for a Canary Deployment and Observability with Prometheus & Grafana
vit-classifier.yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "imagenet-vit"
  annotations:
    serving.kserve.io/enable-metric-aggregation: "true"
    serving.kserve.io/enable-prometheus-scraping: "true"
spec:
  predictor:
    canaryTrafficPercent: 30
    serviceAccountName: s3-read-only
    model:
      modelFormat:
        name: pytorch
      # storageUri: s3://tsai-emlo/kserve-ig-2/imagenet-vit/
      storageUri: s3://tsai-emlo/kserve-ig-2/cat-classifier/
      resources:
        limits:
          cpu: 2600m
          memory: 4Gi
❯ kubectl get isvc
NAME           URL                                     READY   PREV   LATEST   PREVROLLEDOUTREVISION          LATESTREADYREVISION            AGE
imagenet-vit   http://imagenet-vit.default.emlo.tsai   True    70     30       imagenet-vit-predictor-00001   imagenet-vit-predictor-00002   73m
Start sending requests
for i in {1..200}; do python send.py; done
Lots of 4xx errors!
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '56' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:12:35 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '1' }
404
{ 'error' : 'Model with name imagenet-vit does not exist.' }
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:12:35 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '750' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:12:36 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '764' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:12:37 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '751' }
200
This is because the .mar file for cat-classifier still advertises the model name as cat-classifier, so requests for imagenet-vit that are routed to the canary revision get a 404
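Scanning hundreds of printed responses by eye is error-prone, so it can help to tally the status codes. The sketch below summarises a list of codes as fractions; the function name is hypothetical, and collecting the codes (for instance by appending `response.status_code` inside a loop) is left to the caller.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_status_codes(status_codes: Iterable[int]) -> Dict[int, float]:
    """Return the fraction of responses seen for each HTTP status code."""
    counts = Counter(status_codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: count / total for code, count in sorted(counts.items())}


# Illustrative input only, not measured data.
print(summarize_status_codes([200, 200, 404, 200]))
```

With canaryTrafficPercent set to 30 and a broken canary, the 404 fraction would be expected to sit somewhere near 0.3 over a large enough sample.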
Time to Rollback!
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "imagenet-vit"
  annotations:
    serving.kserve.io/enable-metric-aggregation: "true"
    serving.kserve.io/enable-prometheus-scraping: "true"
spec:
  predictor:
    canaryTrafficPercent: 0
    serviceAccountName: s3-read-only
    model:
      modelFormat:
        name: pytorch
      # storageUri: s3://tsai-emlo/kserve-ig-2/imagenet-vit/
      storageUri: s3://tsai-emlo/kserve-ig-2/cat-classifier/
      resources:
        limits:
          cpu: 2600m
          memory: 4Gi
❯ kubectl get isvc
NAME           URL                                     READY   PREV   LATEST   PREVROLLEDOUTREVISION          LATESTREADYREVISION            AGE
imagenet-vit   http://imagenet-vit.default.emlo.tsai   True    100    0        imagenet-vit-predictor-00001   imagenet-vit-predictor-00002   89m
Now the PREV model will go to 100% and the new model will go down to 0% traffic
All of our requests are now going through
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:43 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '751' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:44 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '754' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:45 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '765' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:46 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '754' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:47 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '766' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:48 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '743' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:49 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '781' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:50 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '744' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:51 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '768' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:18:52 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '755' }
200
We can fix this by recreating the cat-classifier .mar file with the model name set to imagenet-vit
After fixing, some of the responses are Bengal Cat!
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:51 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '745' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:52 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '744' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:53 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '761' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:55 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '848' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:55 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '755' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:57 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '792' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:58 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '837' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:40:59 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '816' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:00 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '743' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:01 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '772' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:02 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '782' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:03 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '860' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:04 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '746' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:05 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '778' }
200
{ 'predictions' : [{ 'class' : 'Egyptian cat' , 'probability' : 0.9374412894248962 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:06 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '899' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '75' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:41:07 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '742' }
200
No More Errors! 🪄
We can promote our new model by just removing canaryTrafficPercent
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "imagenet-vit"
  annotations:
    serving.kserve.io/enable-metric-aggregation: "true"
    serving.kserve.io/enable-prometheus-scraping: "true"
spec:
  predictor:
    serviceAccountName: s3-read-only
    model:
      modelFormat:
        name: pytorch
      # storageUri: s3://tsai-emlo/kserve-ig-2/imagenet-vit/
      storageUri: s3://tsai-emlo/kserve-ig-2/cat-classifier/
      resources:
        limits:
          cpu: 2600m
          memory: 4Gi
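The canary, rollback, and promotion manifests in this walkthrough differ only in canaryTrafficPercent. As a sketch of that idea, the helper below takes an InferenceService loaded as a Python dict and returns a copy with the field set or removed. The function name is hypothetical, and loading or applying the manifest is left out.

```python
import copy
from typing import Optional


def with_canary_percent(isvc: dict, percent: Optional[int]) -> dict:
    """Return a copy of the manifest with canaryTrafficPercent set, or removed when None."""
    updated = copy.deepcopy(isvc)  # leave the caller's manifest untouched
    predictor = updated["spec"]["predictor"]
    if percent is None:
        # Promotion: dropping the field sends all traffic to the latest revision.
        predictor.pop("canaryTrafficPercent", None)
    else:
        # Canary (e.g. 30) or rollback (0).
        predictor["canaryTrafficPercent"] = percent
    return updated


manifest = {"spec": {"predictor": {"canaryTrafficPercent": 30}}}
print(with_canary_percent(manifest, 0))
print(with_canary_percent(manifest, None))
```

Keeping a single base manifest and deriving the three variants from it reduces the chance of the storageUri or annotations drifting between steps.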
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:27 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '779' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:28 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '795' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:29 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '767' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:30 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '778' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:32 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '789' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:33 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '798' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:34 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '781' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:35 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '767' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:36 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '922' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:37 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '775' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:38 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '774' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:39 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '787' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69' , 'content-type' : 'application/json' , 'date' : 'Thu, 23 Nov 2023 19:44:40 GMT' , 'server' : 'istio-envoy' , 'x-envoy-upstream-service-time' : '825' }
200
{ 'predictions' : [{ 'class' : 'Bengal' , 'probability' : 0.5737159252166748 }]}
{ 'content-length' : '69'
Look at the Request Volume by Revision panel: our canary model is now promoted and the older version of the model is getting no requests
NOTES: Route Traffic using Tags: https://kserve.github.io/website/0.11/modelserving/v1beta1/rollout/canary-example/#route-traffic-using-a-tag