Shreyansh Cloud API
Powerful AI models accessible through a simple API. Each user gets a personal API key with rate limiting.
Documentation
Authentication
All API requests must include your personal API key in the Authorization header as a Bearer token. Each user receives a unique key at sign-up.
curl https://api.shreyansh.cloud/v3/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "qwen/qwen3-4b-fp8",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
Security Notice: Never expose your API key in client-side code or public repositories. For production applications, use environment variables or secure key management systems.
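The advice above can be followed with a few lines of Python. The sketch below is one possible approach, not an official client: it reads the key from an environment variable and builds the same request as the curl example without sending it. The variable name SHREYANSH_API_KEY and both function names are assumptions chosen for illustration.

```python
import json
import os
import urllib.request

API_URL = "https://api.shreyansh.cloud/v3/chat/completions"


def load_api_key(env_var="SHREYANSH_API_KEY"):
    """Read the key from the environment (the variable name is an assumption)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return key


def build_chat_request(api_key, model, messages):
    """Return an unsent POST request that carries the key as a Bearer token."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; keeping construction separate from sending makes the headers easy to inspect before any key leaves the machine.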
Rate Limits
To ensure fair usage and service stability, we implement rate limits on API requests.
Plan | Rate Limit | Notes |
---|---|---|
Free Tier | 5 requests per minute | Per API key limit |
Pro Tier | 100 requests per minute | Coming soon |
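A client can avoid hitting the limit by pacing its own requests. The sketch below is a minimal client-side throttle set to the Free Tier figure of 5 requests per minute. Whether the server counts requests in a fixed or a sliding window is not stated here, so the sliding window used below is an assumption, and the class name is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls within any period-second window."""

    def __init__(self, max_calls=5, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def _drop_expired(self, now):
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()

    def acquire(self):
        """Block until another request may be sent, then record it."""
        now = self.clock()
        self._drop_expired(now)
        if len(self.calls) >= self.max_calls:
            # Wait until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))
            now = self.clock()
            self._drop_expired(now)
        self.calls.append(now)
```

Call acquire() immediately before each request. The clock and sleep parameters exist only so the behavior can be checked without real waiting.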
Rate Limit Response
When you exceed your rate limit, you'll receive a 429 status code with the following response:
{
  "error": {
    "message": "Rate limit exceeded",
    "type": "rate_limit_exceeded",
    "code": 429
  }
}
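One common way to handle a 429 is to wait and retry with increasing delays. The sketch below assumes a caller-supplied send function that returns a (status_code, parsed_body) pair; that function, the helper name, and the 12-second starting delay (60 seconds divided by the 5 requests of the Free Tier) are illustrative assumptions, not documented behavior of this API.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=12.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is any zero-argument callable returning (status_code, parsed_body).
    The delay doubles after every rate-limited attempt.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    # Every attempt was rate limited; hand back the last 429 response.
    return status, body
```

If every attempt is rate limited, the function returns the last 429 response, so the caller decides whether to give up or queue the work for later.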
Available Models
Qwen 3 4B FP8
Model ID: qwen/qwen3-4b-fp8
A 4-billion-parameter model from the Qwen 3 series, served in FP8 precision for efficient inference.
Llama 3.2 1B Instruct
Model ID: llama-3.2-1b-instruct
A compact 1-billion-parameter, instruction-tuned model from Meta's Llama 3.2 series, optimized for dialogue.
List All Models
You can retrieve a list of all available models using the following endpoint:
curl "https://api.shreyansh.cloud/v3/models" \
  -H "Authorization: Bearer YOUR_API_KEY"
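The same call can be sketched in Python with the standard library. The shape of the response is not described here, so the hypothetical fetch_models helper below returns the parsed JSON as-is rather than assuming any particular fields.

```python
import json
import urllib.request

MODELS_URL = "https://api.shreyansh.cloud/v3/models"


def build_models_request(api_key):
    """Return an unsent GET request for the models endpoint."""
    return urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_models(api_key, timeout=10.0):
    """Send the request and return the decoded JSON without interpreting it."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```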
Chat Completions
The chat completions endpoint allows you to have conversations with the AI models. Send a series of messages and receive a model-generated response.
Endpoint
POST https://api.shreyansh.cloud/v3/chat/completions
Request Format
{
  "model": "model-id",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 500
}
Parameters
- model: The ID of the model to use (e.g., "qwen/qwen3-4b-fp8")
- messages: An array of message objects with "role" and "content"
- temperature: Controls randomness (0.0 to 1.0, default 0.7); lower values make the output more deterministic
- max_tokens: Maximum number of tokens to generate
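The parameters listed above can be assembled and checked before a request is sent. The following is a minimal sketch: the helper name is hypothetical, and the client-side validation simply mirrors the ranges stated in the list rather than any documented server behavior.

```python
def build_chat_payload(model, messages, temperature=0.7, max_tokens=None):
    """Build the JSON-ready request body for the chat completions endpoint."""
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs 'role' and 'content'")
    payload = {
        "model": model,
        "messages": list(messages),
        "temperature": temperature,
    }
    # max_tokens is left out unless the caller sets it explicitly.
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```

Serialize the result with json.dumps and send it as the request body.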
Curl Examples
Basic Example with Qwen
curl "https://api.shreyansh.cloud/v3/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-4b-fp8",
    "messages": [
      {"role": "user", "content": "How are you?"}
    ]
  }'
Example Response
{
  "id": "a208bad9fd4bda74b4e4815067a2818d",
  "object": "chat.completion",
  "created": 1757162710,
  "model": "qwen/qwen3-4b-fp8",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'm just a virtual assistant, so I don't have feelings, but I'm here and ready to help! How can I assist you today? 😊"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 191,
    "total_tokens": 203
  }
}
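A response of this shape can be unpacked with the standard json module. The sketch below assumes only the fields visible in the example response; the extract_reply helper is hypothetical.

```python
import json


def extract_reply(response_text):
    """Pull the assistant text, finish reason, and token usage from a response."""
    data = json.loads(response_text)
    choice = data["choices"][0]  # the example response holds a single choice
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }
```

A finish_reason of "stop", as in the example, indicates the model ended its reply on its own.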
Conversation Example with Llama
curl "https://api.shreyansh.cloud/v3/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-1b-instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant that translates English to French."},
      {"role": "user", "content": "Translate the following English text to French: Hello, how are you?"}
    ],
    "temperature": 0.3,
    "max_tokens": 100
  }'
Advanced Example with Parameters
curl "https://api.shreyansh.cloud/v3/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3-4b-fp8",
    "messages": [
      {"role": "system", "content": "You are a knowledgeable science tutor."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "temperature": 0.5,
    "max_tokens": 500,
    "top_p": 0.9,
    "frequency_penalty": 0.2,
    "presence_penalty": 0.3
  }'
Error Handling
The API uses standard HTTP status codes to indicate the success or failure of a request.
Status Code | Error Type | Description |
---|---|---|
400 | Bad Request | Invalid request parameters |
401 | Unauthorized | Invalid or missing API key |
404 | Not Found | Requested resource not found |
429 | Too Many Requests | Rate limit exceeded |
500 | Internal Server Error | Something went wrong on our end |
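The table above can be turned into a small client-side check. The sketch below is illustrative only: the ApiError class and check_status helper are hypothetical names, the error body is assumed to follow the shape shown for the 429 response, and treating 429 and 500 as retryable is a design assumption rather than documented guidance.

```python
ERROR_TYPES = {
    400: "Bad Request",
    401: "Unauthorized",
    404: "Not Found",
    429: "Too Many Requests",
    500: "Internal Server Error",
}


class ApiError(Exception):
    """Raised for any non-2xx status returned by the API."""

    def __init__(self, status, error_type, message):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message
        # Assumption: only rate limits and server faults are worth retrying.
        self.retryable = status in (429, 500)


def check_status(status, body=None):
    """Return silently on success, otherwise raise ApiError."""
    if 200 <= status < 300:
        return
    error_type = ERROR_TYPES.get(status, "Unknown Error")
    detail = (body or {}).get("error", {})
    raise ApiError(status, error_type, detail.get("message", error_type))
```

Callers can inspect the retryable attribute to decide between retrying and surfacing the error to the user.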